Intro to Machine Learning by Microsoft Ventures

Machine learning lets you make better business decisions by uncovering patterns in your consumer behavior data that are hard for the human eye to spot. You can also use it to automate routine, expensive human tasks that computers could not previously perform. In the business-to-business (B2B) space, if your competitors make wiser, data-driven business decisions and automate more of their operations while you still rely on guesswork and manual processes, you will fall behind in productivity. In this introductory tech talk, you will learn how to use machine learning even if you do not have deep technical expertise in the technology.

Topics covered:
1. What is machine learning
2. What is a typical ML application architecture
3. How to start ML development with free resource links
4. Key decision factors in ML technology selection depending on use case scenarios

Published in: Technology, Education

Intro to Machine Learning by Microsoft Ventures

  1. 1. Introduction To Machine Learning Chun Ming Chin Microsoft Ventures - @MSFTVentures Chun Ming Chin - @chinchunming (Machine Learning Workshop by Microsoft Ventures)
  2. 2. Objectives • Understand why machine learning is important • Learn how to apply machine learning in your use case • Adopt best practices in machine learning
  3. 3. Overview • Why Machine Learning • What is Machine Learning • Frame common tasks as machine learning problems • Example 1: Mobile Optical Character Recognition on Asian text • Example 2: Predict Housing Rental Prices • Accuracy Issues (i.e. Generalization) • Solutions to Generalization • Putting it all together: Machine learning in stock trading • Machine Learning Best Practices
  4. 4. Why Machine Learning? 1. Increase the barrier to entry when product/service quality depends on data (Diagram labels: Product/Service, Users, Data, Increase quality/quantity)
  5. 5. Why Machine Learning? 2. Automate human operations to increase productivity and lower cost • Example: Automatically identify and ban bots that sign up on your website • Use Case: Consider a rules-based approach first. When the task cannot be completed with specific rules, use ML. (Figure: email address field, _______@______.com)
  6. 6. Why Machine Learning? 3. Customize product/service to increase engagement and profits • Example: Customize the sales page to increase conversion rates for an online information product (Figure: page elements labeled A to F, shown twice)
  7. 7. Machine Learning Example • Goal: Get the computer to classify an input image as Chinese or Japanese. • Chinese Traditional (sophisticated and big): 婆 魔 佛 特 級 氣 喜 歡 • Japanese (squiggly and cute): か き け こ さ し す せ • Features: Characteristics of the image / measurements from the data, e.g. no. of black pixels, orientation of strokes in the image (Figure: Chinese and Japanese samples plotted as features in 2D space; axes: no. of black pixels in image, no. of straight lines in image)
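
The two features named on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the 5x5 binary images, the helper names, and the "row with a long horizontal run" proxy for straight lines are all assumptions, not the measurements used in the talk.

```python
# Hypothetical 5x5 binary images: 1 = black pixel, 0 = white pixel.

def count_black_pixels(image):
    """Feature 1: total number of black pixels in the image."""
    return sum(sum(row) for row in image)

def count_straight_rows(image, min_run=4):
    """Feature 2 (crude proxy for straight lines): number of rows that
    contain at least `min_run` consecutive black pixels."""
    count = 0
    for row in image:
        run = best = 0
        for pixel in row:
            run = run + 1 if pixel else 0
            best = max(best, run)
        if best >= min_run:
            count += 1
    return count

def extract_features(image):
    """Map an image to a point in the 2D feature space."""
    return (count_black_pixels(image), count_straight_rows(image))

# A dense, grid-like glyph and a sparse, curvy glyph (made-up examples).
dense = [
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
]
sparse = [
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
]

print(extract_features(dense))   # (21, 3)
print(extract_features(sparse))  # (7, 0)
```

The two glyphs land far apart in the 2D feature space, which is what lets a decision boundary separate them.
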
  8. 8. ML Terminology • Data point = Sample = Example • Labels/ Classes/ Categories: • Discrete (e.g. Optical Character Recognition) • Continuous (e.g. Housing prices) • Classification/decision boundary: • Separates regions of feature space • Hopefully helps separate different classes (Figure: data matrix with rows indexed by data point i, columns by feature dimension j, and a label column; 2D feature-space plot showing data points of Label 1 and Label 2 and the decision boundary)
  9. 9. What is Machine Learning? Unsupervised learning • Algos that operate on unlabelled examples • Discover structure/ patterns in the data. Supervised Learning • Algos trained on labelled examples • Predict an output for previously unseen inputs.
  10. 10. Supervised Machine Learning (Classification) • Training stage (usually offline): Training data set (measurements (features) and associated class labels) → Training algorithm → Learned model f(x) (structure + parameters) • Testing stage (run time, online): Input test data point (measurements (features) only) → f(x) → Predicted class label
  11. 11. Mobile Optical Character Recognition of Asian Text • Input test images → Classifier f(x) → Image classification (example outputs: Me, Competition, Expense, Middle) • Image measurements: Orientation of strokes in image / spatial position of pixels • Use Case: Scale up a product concept with a trade-off on accuracy.
  12. 12. Supervised Machine Learning (Regression) • Training stage: Training data set (measurements (features) and associated continuous labels) → Training algorithm → Learned model f(x) (structure + parameters) • Testing stage: Input test data point (measurements (features) only) → f(x) → Continuous value output
  13. 13. Example: Predict rental prices based on house area (Sq ft) Training stage: 1. Raw input data (table shown on slide) 2. Use a training algorithm from Python’s ML library. 3. Get the resulting ML model f(x) Use Case: Make predictions based on historical data
  14. 14. Case Study: Predict rental prices based on house area (Sq ft) Testing stage: Feature: Area (e.g. 1800 Sq ft) → Regressor f(x) → Rental Price ($), ranging from cheap to expensive
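
The training and testing stages of the rental-price example can be sketched with an ordinary least-squares line fit. The slides do not say which library or model was used, so this is only an illustration in plain Python; the area/rent numbers are made up and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Training stage: closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical historical data: house area (sq ft) and monthly rent ($).
# The rents were generated as 1.0 * area + 500, so the fit is exact.
areas = [800, 1000, 1200, 1500, 2000]
rents = [1300, 1500, 1700, 2000, 2500]

slope, intercept = fit_line(areas, rents)

def f(area):
    """Testing stage: the learned model maps a feature to a continuous value."""
    return slope * area + intercept

predicted_rent = f(1800)
print(round(predicted_rent, 2))  # 2300.0
```
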
  15. 15. Optimization with Objective Function (Figure: fitted line at Iteration 1, Iteration 2, Iteration 3 and Iteration 4; axes: Area (Square Feet) vs. Rental Price ($))
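
One way to picture the four iterations is gradient descent on a mean-squared-error objective. The slide does not name the objective or the optimizer, so both are assumptions here, as are the data (areas in thousands of square feet, rents in thousands of dollars) and the learning rate.

```python
def mse(w, b, xs, ys):
    """Objective function: mean squared error of the line y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_step(w, b, xs, ys, lr):
    """One gradient-descent update of (w, b) on the MSE objective."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b

# Hypothetical data, rescaled so that a fixed learning rate behaves well.
xs = [0.8, 1.0, 1.2, 1.5, 2.0]   # area in thousands of sq ft
ys = [1.3, 1.5, 1.7, 2.0, 2.5]   # rent in thousands of dollars

w, b = 0.0, 0.0                  # start from a flat line
losses = [mse(w, b, xs, ys)]
for iteration in range(1, 5):    # Iteration 1 .. Iteration 4
    w, b = gradient_step(w, b, xs, ys, lr=0.1)
    losses.append(mse(w, b, xs, ys))
    print(f"Iteration {iteration}: w={w:.3f} b={b:.3f} loss={losses[-1]:.4f}")
```

Each iteration moves the line closer to the data, so the objective value shrinks from one iteration to the next.
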
  16. 16. Generalization Issues (Figure: train and test data points; axes: Area (Square Feet) vs. Rental Price ($); legend: Test data point, Train data point)
  17. 17. Generalization Issues Under fitting: • Number of features used is too small • There are patterns in the data that the algorithm is unable to fit Over fitting: • Number of features used is too large • Fitting spurious patterns in the training data set rather than capturing the true underlying trends
  18. 18. Under/Over Fitting in House Rental Prices Prediction (Figure panels: Under fitting in Regression; Over fitting in Regression)
  19. 19. Under/Over Fitting in Optical Character Recognition (Figure panels: Under fitting in Classification; Over fitting in Classification, with an outlier marked)
  20. 20. Generalization (Figure: training error and test error plotted against no. of iterations, with under fit and over fit regions and the point of best generalization marked)
  21. 21. Fixes for ML algorithms Solutions to accuracy issues, prioritized in descending order of sensitivity to classification error: 1. Training data improvement • Get more training examples (Fixes over fitting) • Ensure training data is high quality (De-noise training data) 2. Modify objective function 3. Feature engineering • Increase/reduce number of features (Fixes under/over fitting) • Change features used (Fixes under fitting) 4. Optimization algorithm • Change the ML model used (SVMs, decision trees, neural networks, etc.) • Run optimization algo for more iterations to ensure it converges
  22. 22. Solution: Increase amount of training data (Figure: before and after plots of training error and test error vs. no. of iterations, each marking the point of best generalization)
  23. 23. Feature Engineering: Increase number of features • Combine features: $x_3 = x_1 \times x_2$ • Convert continuous features into categorical features (i.e. bucketize feature values) • Create a new feature as an indicator for missing values in another feature, and supply a default value for the missing feature value. • To address non-linearly separable data, use non-linear features (e.g. for an original feature $x$, add the derived feature $x^2$; going beyond degree 2 is not advisable).
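
The four techniques on this slide can be sketched on a single made-up housing record. The feature names, bucket edges, and default value below are hypothetical choices made for illustration only.

```python
import bisect

def combine(x1, x2):
    """Combined feature: x3 = x1 * x2."""
    return x1 * x2

def bucketize(value, edges):
    """Convert a continuous value into a bucket index.
    `edges` must be sorted; values below edges[0] fall into bucket 0."""
    return bisect.bisect_right(edges, value)

def fill_missing(value, default):
    """Return (filled value, indicator), where indicator is 1 if the value was missing."""
    if value is None:
        return default, 1
    return value, 0

def engineer(area, rooms, age):
    """Build an enlarged feature set from three raw features."""
    age_filled, age_missing = fill_missing(age, default=20.0)
    return {
        "area": area,
        "rooms": rooms,
        "area_x_rooms": combine(area, rooms),          # combined feature
        "area_bucket": bucketize(area, [1000, 2000]),  # 0: small, 1: medium, 2: large
        "age": age_filled,                             # default supplied if missing
        "age_missing": age_missing,                    # missing-value indicator
        "area_squared": area ** 2,                     # degree-2 non-linear feature
    }

row = engineer(area=1800, rooms=3, age=None)
print(row)
```
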
  24. 24. Feature Engineering: Reduce number of features Reduce no. of features and identify most important features • Data is more compact and dense • Can train and classify faster • Can improve accuracy
  25. 25. Feature Engineering Example: Asian Optical Character Recognition • Use domain knowledge to choose features that distinguish different classes well • Read academic papers to understand prior work in the field • Extract features: Filter out stroke orientation information along 0, 45, 90 and 135 degrees. • Feature vector dimension: 5 (no. of sub blocks per row) x 5 (no. of sub blocks per column) x 2 (avg & var calculated per sub block) x 4 (no. of orientations) = 200
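
The dimension count 5 x 5 x 2 x 4 = 200 can be checked with a short sketch. The orientation filters themselves are not described on the slide, so the 10x10 "orientation response" maps below are filled with placeholder numbers; only the sub-block averaging and variance step is illustrated.

```python
def block_stats(response, blocks_per_side=5):
    """Split a square response map into blocks_per_side x blocks_per_side
    sub blocks and return [mean, variance] for each sub block, in row-major order."""
    size = len(response)
    step = size // blocks_per_side
    features = []
    for block_row in range(blocks_per_side):
        for block_col in range(blocks_per_side):
            values = [
                response[r][c]
                for r in range(block_row * step, (block_row + 1) * step)
                for c in range(block_col * step, (block_col + 1) * step)
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            features.extend([mean, variance])
    return features

def feature_vector(responses_by_orientation):
    """Concatenate the sub-block statistics of every orientation map."""
    vector = []
    for response in responses_by_orientation:
        vector.extend(block_stats(response))
    return vector

# Placeholder 10x10 response maps, one per orientation (values are made up).
orientations = [0, 45, 90, 135]
responses = [
    [[float((r + c + angle) % 3) for c in range(10)] for r in range(10)]
    for angle in orientations
]

vector = feature_vector(responses)
print(len(vector))  # 200
```
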
  26. 26. Solution: Try different ML models (i.e. Optimization algorithms) Decision Factors • Ease of training/ testing • Ease of debugging • Model size (Memory constraints) • Accuracy/ generalization potential • Data characteristics (e.g. For non-linearly separable data, use non-linear models)
  27. 27. Support Vector Machines (SVM) Why: • Linear and non-linear classification • Best for binary (i.e. 2 class) classification, though can be used for multi-class scenarios • Easy to train • Guaranteed global optimum • Scales well to high dimensional data What: • Find the decision boundary that maximizes the margin between classes. The boundary is determined only by nearby data points (the support vectors)
  28. 28. Support Vector Machines (SVM) Choosing correct kernel function for non linear SVM (Original video from Udi Aharoni here: https://www.youtube.com/watch?v=3liCbRZPrZA) 1. Find non linear boundary that separates blue from red data points in 2D space 2. Map data points into 3D space using polynomial kernel. 3. Linearly separate data points in 3D space using a plane. 4. Map back to 2D space to determine non linear boundary.
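
The 2D to 3D mapping in these four steps can be sketched with an explicit degree-2 polynomial feature map. The particular map, the unit-circle boundary, and the labels "inner"/"outer" (standing in for the blue and red points) are assumptions chosen for illustration; an SVM would learn the separating plane from data rather than have it fixed by hand.

```python
import math

def phi(point):
    """Explicit feature map for the degree-2 polynomial kernel k(p, q) = (p . q)^2."""
    x1, x2 = point
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(p, q):
    """Kernel evaluated in 2D; equals dot(phi(p), phi(q)) in 3D."""
    return dot(p, q) ** 2

def classify(point, radius_squared=1.0):
    """In 3D, the circle x1^2 + x2^2 = r^2 becomes the plane z1 + z3 = r^2,
    so a linear rule in 3D gives a non-linear (circular) boundary in 2D."""
    z1, _, z3 = phi(point)
    return "inner" if z1 + z3 < radius_squared else "outer"

p, q = (1.0, 2.0), (3.0, -1.0)
print(poly_kernel(p, q), dot(phi(p), phi(q)))  # both are 1.0 up to rounding
print(classify((0.5, 0.5)))  # inner
print(classify((1.5, 0.0)))  # outer
```
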
  29. 29. Decision Trees Why: • Non-linear classification & regression • Pros: • Easy to understand and debug • Finds most important features in data • Requires little data preparation • Cons: • Memory concerns limit the accuracy of decision trees: deeper trees give higher test accuracy, but tree size grows exponentially with depth What: • Partition feature space into smaller pieces • Learn tree structure and split functions (Figure: example tree with a root node and split functions $x_1 < 140$ / $x_1 > 140$ and $x_2 < 140$ / $x_2 > 140$; node counts as extracted: C = 11, J = 1; C = 3, J = 13; C = 2, J = 1; C = 1, J = 12)
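
Learning a split function can be sketched as a search over features and thresholds for the split with the lowest weighted impurity. Gini impurity is an assumed criterion (the slide does not name one), and the six labeled samples are made up so that the best split happens to fall at the threshold 140.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    impurity = 1.0
    for label in set(labels):
        p = labels.count(label) / n
        impurity -= p * p
    return impurity

def best_split(samples, labels):
    """Try every feature and every midpoint between consecutive feature values.
    Returns (weighted impurity, feature index, threshold) of the best split."""
    best = None
    n = len(samples)
    for j in range(len(samples[0])):
        values = sorted(set(s[j] for s in samples))
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for s, y in zip(samples, labels) if s[j] < threshold]
            right = [y for s, y in zip(samples, labels) if s[j] >= threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, j, threshold)
    return best

# Hypothetical samples (x1, x2) with class labels "C" and "J".
samples = [(100, 90), (120, 160), (130, 100), (150, 110), (160, 170), (180, 95)]
labels = ["J", "J", "J", "C", "C", "C"]

score, feature_index, threshold = best_split(samples, labels)
print(score, feature_index, threshold)  # 0.0 0 140.0
```

A full tree applies the same search recursively to the left and right partitions until a depth or purity limit is reached.
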
  30. 30. Decision Forests Why: • Solves instability in decision trees - Small variations in data can generate a different tree • Improves memory-accuracy tradeoff as trees can be parallelized. What: • A collection (i.e. Ensemble) of trees • Aggregate predictions across all trees.
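
Aggregating predictions across trees can be sketched as a majority vote. The three "trees" below are hand-written one-split stumps with hypothetical thresholds; in a real forest each tree would be trained on a different random sample of the data.

```python
from collections import Counter

def make_stump(feature_index, threshold, below, above):
    """Return a one-split tree: predicts `below` if the feature is under the threshold."""
    def predict(sample):
        return below if sample[feature_index] < threshold else above
    return predict

def forest_predict(trees, sample):
    """Aggregate the individual tree predictions by majority vote."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Three slightly different trees, as small variations in data might produce.
trees = [
    make_stump(0, 140, "J", "C"),
    make_stump(0, 135, "J", "C"),
    make_stump(1, 100, "C", "J"),
]

print(forest_predict(trees, (150, 120)))  # C (votes: C, C, J)
print(forest_predict(trees, (100, 90)))   # J (votes: J, J, C)
```

The third tree disagrees with the other two on both samples, yet the ensemble prediction stays stable, which is the instability fix the slide refers to.
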
  31. 31. (Deep) Neural Networks Why: • For non-linear classification & regression • Pros: • Fast in testing stage • Robust to noise • Cons: • Slow in training stage • Only guarantees local optima What: • Sequence of non-linear combinations of extremely simple computations with high connectivity (Figure: network with an input layer, hidden (deep) layers and an output layer)
  32. 32. Model selection: Use Cross Validation • Held Out Cross Validation 1. Randomly split all data into 2 subsets: training set (70%) and validation set (30%) 2. Train machine learning model on training set. 3. Pick model with lowest error on validation set. • K Fold Cross Validation 1. Divide data into K pieces 2. Train on K – 1 pieces 3. Validate on the remaining piece, repeating so that each piece serves once as the validation set 4. Average over K results to get generalization error of model (Figure: data matrix with rows indexed by data point i, columns by feature dimension j, and a label column, partitioned into a training set and a validation set)
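
Both procedures can be sketched in plain Python. The "model" here is a deliberately trivial one that predicts the mean training label, and the data set is made up; `train_mean` and `mean_abs_error` are hypothetical helpers standing in for whatever training algorithm and error measure you actually use.

```python
import random

def holdout_split(data, train_fraction=0.7, seed=0):
    """Held-out validation: random split into training and validation sets."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def k_fold_error(data, k, train, error):
    """K-fold cross validation: train on K - 1 pieces, validate on the
    remaining piece, and average the K validation errors."""
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        validation = folds[i]
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = train(training)
        errors.append(error(model, validation))
    return sum(errors) / k

def train_mean(rows):
    """Trivial stand-in model: always predict the mean training label."""
    return sum(y for _, y in rows) / len(rows)

def mean_abs_error(model, rows):
    return sum(abs(model - y) for _, y in rows) / len(rows)

# Hypothetical data set of (feature, label) pairs.
data = [(x, 2.0 * x) for x in range(10)]

train_set, validation_set = holdout_split(data)
cv_error = k_fold_error(data, 5, train_mean, mean_abs_error)
print(len(train_set), len(validation_set), cv_error)
```

To select a model, compute the same error for each candidate and keep the one with the lowest value.
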
  33. 33. Asian Optical Character Recognition Validation Results (Bar chart; models: Nearest Neighbor, Linear SVM, Sigmoid SVM, Polynomial SVM, RBF SVM, Intersection SVM; series: With reduced feature dimension, Without reduced feature dimension; bar values in extraction order: 91.66%, 93.10%, 21.40%, 89.00%, 89.00%, 92.70%, 75.00%, 87.04%, 0.02%, 35.96%, 35.96%, 85.92%)
  34. 34. Putting it all together: Machine Learning in Stock Trading Use case: Machines see patterns in big data faster and perform tasks faster than humans.
  35. 35. Case Study: Machine Learning in Stock Trading • Goal: Create profitable trading strategy • Data: Use company (e.g. United Airlines) stock data from Wharton Research Data Services (WRDS) https://wrdsweb.wharton.upenn.edu/wrds/ • Implementation: Predict closing price on next time step based on information from current time step *Disclaimer: This may not make money
  36. 36. Case Study: Machine Learning in Stock Trading Training stage: Training algorithm → Learned model f(x) (structure + parameters)
  37. 37. Case Study: Machine Learning in Stock Trading Testing stage: Trade price measurements → Classifier f(x) → Buy / Hold / Sell • Measurements: Average trade price per sec, standard deviation of trade price per sec • Buy: Trade price at next time step > trade price of the current time step (i.e. positive returns). Label = +1. • Hold: Trade price at next time step = trade price of the current time step. Label = 0. • Sell: Trade price at next time step < trade price of the current time step (i.e. negative returns). Label = -1.
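
The labeling rule and the two measurements can be sketched as follows. The tick prices are made up, and taking the last trade in each second as that time step's trade price is an assumption; the slides do not say how the per-step price was defined.

```python
from statistics import mean, pstdev

def label(current_price, next_price):
    """+1 = buy (price rises), 0 = hold (unchanged), -1 = sell (price falls)."""
    if next_price > current_price:
        return 1
    if next_price < current_price:
        return -1
    return 0

def window_features(trade_prices):
    """Measurements for one second: average and standard deviation of trade price."""
    return (mean(trade_prices), pstdev(trade_prices))

# Hypothetical trade prices, grouped by one-second windows.
ticks_per_second = [
    [20.00, 20.02],
    [20.05, 20.05],
    [20.03, 20.05],
    [19.90, 20.00],
]

# Assumption: the last trade in each window is that time step's trade price.
step_prices = [ticks[-1] for ticks in ticks_per_second]
labels = [label(now, nxt) for now, nxt in zip(step_prices, step_prices[1:])]
features = [window_features(ticks) for ticks in ticks_per_second]

# Pair each window's features with the label for the move to the next window.
training_rows = list(zip(features[:-1], labels))
print(labels)  # [1, 0, -1]
```
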
  38. 38. Training Data Improvements o Terminology • Trade price: Price at which shares last changed hands • Bid price: Price a buyer is willing to pay • Bid size: No. of shares available at bid price • Offer price: Price a seller is looking to get • Offer size: No. of shares available at offer price o Intuition: the trade price in the next second should depend on the bid-offer information now. o Add bid and offer price data (within 5% of the trade price) to the training data
  39. 39. Feature Engineering • Extract measurements from the distribution of the bid-offer curve at each one-second window. (Figure: bid and offer curves for United Airlines Inc. (ticker symbol UAL) on 2011 Dec 01 at 9:30:03 am, 9:30:04 am and 9:30:05 am)
  40. 40. Machine Learning Best Practices Combine human intuition/wisdom with machine speed/pattern recognition. • Use domain knowledge to choose features that distinguish different classes well • Add specific rules to process input data before the training stage and output data after the test stage (in contrast with the generalized rules learned by machine learning). Pre-processing the data reduces training time compared with letting the ML model eventually learn the patterns on its own.
  41. 41. Useful Machine Learning Tools • Python installer for Windows with all necessary ML libraries (e.g., SciPy, NumPy, etc.) http://winpython.sourceforge.net/ • http://www.r-project.org/ R project for statistical computing • http://prediction.io/ Create predictive features, such as personalization, recommendation and content discovery • http://www.tableausoftware.com/ Enables you to visually analyze your data
  42. 42. Unsupervised Learning Application Scenarios (e.g. MinHash Clustering, Matrix Factorization, Dimensionality Reduction) 1. Simplify your data so as to provide insights for 3rd party businesses. 2. Interpret data to test assumptions about your user’s behavior/ market 3. Cluster your customers into different groups with different needs so as to increase monetization. 4. Make more informed business decisions for your startup/ customers.
  43. 43. More Best Machine Learning Practices • For debugging purposes, a simple model with a clear explanation is better than a complicated model with an unclear one. • To address non-linearly separable data: Use non-linear features/classifiers. • Don’t confuse cause and effect with noise in your data. Consider using a statistical test (e.g. McNemar’s test) to check whether your result is statistically significant. • If your training data set is too small, it can cause over fitting. • Modularize your machine learning model so that it is easy for new people to understand and experiment with. • Have a common baseline when comparing improvements to your machine-learned model. Common baselines enable you to share resources when comparing different techniques. • Modularize your code so that it is easy for new people to experiment quickly with different techniques (parameter sweeping etc.). Define inputs/outputs/training parameters clearly. • Compare to natural baselines: guess the global average for item ratings; suggest globally popular items. • You can use UI/UX to your advantage to find hacks around the trade-off between computational speed (latency) and memory capacity. • Incrementally update your ratings using stochastic gradient descent (i.e. as new observations arrive, update for that user and item only). An alternative is weekly batch retraining. • The more expressive your model, the less expressive your features need to be, and vice versa.
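
The incremental update and the global-average baseline can be sketched for a ratings model. The slide does not specify the model, so a small matrix-factorization predictor (rating = dot product of a user vector and an item vector) is assumed here, along with made-up vectors, learning rate, and regularization strength.

```python
def predict(user_vec, item_vec):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(u * v for u, v in zip(user_vec, item_vec))

def sgd_update(user_vec, item_vec, rating, lr=0.05, reg=0.02):
    """One stochastic gradient descent step for a single new observation.
    Only this user's and this item's vectors change."""
    error = rating - predict(user_vec, item_vec)
    new_user = [u + lr * (error * v - reg * u) for u, v in zip(user_vec, item_vec)]
    new_item = [v + lr * (error * u - reg * v) for u, v in zip(user_vec, item_vec)]
    return new_user, new_item

def global_average(ratings):
    """Natural baseline: predict the same average rating for everything."""
    return sum(ratings) / len(ratings)

# Hypothetical current factors and a newly observed rating of 4.0.
user_vec = [0.1, 0.2]
item_vec = [0.3, 0.1]
new_rating = 4.0

before = predict(user_vec, item_vec)
user_vec, item_vec = sgd_update(user_vec, item_vec, new_rating)
after = predict(user_vec, item_vec)

print(before, after)                       # the prediction moves toward 4.0
print(global_average([4.0, 3.0, 5.0, 2.0]))  # 3.5
```

Weekly batch retraining would instead refit all vectors from the full ratings matrix; the incremental step trades some accuracy for freshness and low cost.
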
  44. 44. More Best Machine Learning Practices • Think about scaling early: 1. Sample a subset of ratings for each user so that you can handle the matrix in memory. 2. Use MinHash to cluster users [DDGR07] 3. Distribute calculations with MapReduce 4. Distribute matrix operations with MapReduce [GHNS11] 5. Parallelize stochastic gradient descent [ZWSL10] 6. Expectation-maximization for pLSI with MapReduce [DDGR07] • Note: Niche vs. general (tf-idf): both of us watching a niche movie should mean more than both of us watching a popular movie. Given practical constraints, people very often use item-based similarity.
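
The MinHash idea in item 2 can be sketched as follows: each user's set of item ids is compressed into a short signature, and the fraction of matching signature positions estimates how similar two users are. The hash family, the number of hash functions, and the user data are assumptions for illustration, not the scheme from [DDGR07].

```python
import random

PRIME = 2_147_483_647  # 2^31 - 1, larger than any item id used here

def make_hash_functions(count, seed=0):
    """Random linear hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(count)]

def minhash_signature(item_ids, hash_functions):
    """For each hash function, keep the minimum hash over the user's items."""
    return tuple(
        min((a * item + b) % PRIME for item in item_ids)
        for a, b in hash_functions
    )

def estimated_similarity(sig_a, sig_b):
    """Fraction of positions where two signatures agree; this approximates
    the Jaccard similarity of the underlying item sets."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)

# Hypothetical users, each described by the ids of the movies they watched.
hash_functions = make_hash_functions(64)
alice = minhash_signature({1, 2, 3, 4, 5, 6}, hash_functions)
bob = minhash_signature({1, 2, 3, 4, 5, 7}, hash_functions)
carol = minhash_signature({101, 102, 103}, hash_functions)

print(estimated_similarity(alice, alice))  # 1.0
print(estimated_similarity(alice, bob))    # an estimate of the true overlap 5/7
print(estimated_similarity(alice, carol))  # 0.0, the sets share no items
```

Users whose signatures agree on many positions can be placed in the same cluster without ever comparing their full rating histories.
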
