Discover the Potential of your Data with Machine Learning


We have moved on from the age when organizations had to jump through hoops to get relevant data about their customers, business segments, and markets. With the onset of massive digitization across the business world, collecting information is no longer a challenge, but processing it into meaningful insights is.

Enterprises in this competitive business landscape are focusing on an integrated, data-driven approach to enable rapid and accurate decision making. Machine learning, no longer confined to the realm of sci-fi, is emerging as a front-runner for overcoming this challenge thanks to its multi-faceted applications and benefits.

Enterprises implementing machine learning can gain new market insights, predict customer behavior or preferences, and reduce operating costs by improving the effectiveness and efficiency of their business processes, resulting in an overall improvement in both top-line and bottom-line margins.

"Thank you for joining us for an insightful webinar on “Discover the Potential of Your Data with Machine Learning”. Attendees got insights on how to use machine learning on enterprise data, along with the tools and technologies needed and some interesting real world use cases."

Published in: Technology

  1. 1. Discover the Potential of Your Data with Machine Learning
  2. 2. Housekeeping • Webinar recordings and slides will be shared with all attendees • Type in your questions and comments using the question pane on the right hand side © Harbinger Systems |
  3. 3. Presenters • Lalit Kumar, Business Analyst, Harbinger Systems • Gautam Mainkar, Data Analyst, Harbinger Systems
  4. 4. Agenda • A practical definition • Why it's important • Using machine learning on enterprise data – Types of business problems machine learning can solve – How to categorize a problem: regression, clustering, and classification • Overview of key algorithms, tools and technologies • Walk-through of real-world use cases
  5. 5. Machine Learning (ML) – A Practical Definition • A type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed – The computer can infer rules inherent in data – The computer adapts when exposed to new data
  6. 6. Why Do We Need It? (Comic by XKCD)
  7. 7. Enterprise Data Hides Information • “There are things we know we know. There are things we know we don't know. But there are also things we don't know we don't know.” - Donald Rumsfeld
  8. 8. What Constitutes a Machine Learning Problem? • Emphasis of machine learning is on automatic methods • Devise learning algorithms that do the learning automatically without human intervention • Program by example: we don't care what the machine does, as long as it does it right • Result-oriented rather than process-oriented
  9. 9. How Can Machine Learning Add Value? • ML is a data-driven approach – Business knowledge isn’t necessary • ML is domain independent – The same algorithms can be used across domains and in different use cases • ML creates flexible decision systems – Creates robust systems that can adjust to changing conditions without human intervention
  10. 10. ML and Big Data • ML thrives with big data! – Accuracy of algorithms generally increases with the size of the data – Statistical approaches can handle big datasets much better than traditional paradigms – Decision making using ML can adapt to transactional data much better
  11. 11. Machine Learning Applications: Some Examples • Fraud Detection: Did the user really do this login/make this purchase? • Product Recommendation: Will the user like this product? • Stock Trading: Will the stock go up or down? • Medical Diagnosis: Given some symptoms, what is the patient suffering from?
  12. 12. How to Categorize the Problem? Generally, machine learning problems look to: • Identify a value • Assign data points to a category • Discover similarities between data points
  13. 13. Flowchart (figure; node labels only) • Start • Sufficient Data? • Sort into category? • Predict a value? • Refine Problem! • Labeled Data • Clustering • Classification • Get more! • Regression
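The flowchart's decision nodes can be read as a small decision function. The sketch below is one plausible reading only: the transcript keeps the node labels but not the arrows, so the branch order, the function name `categorize_problem`, and the `goal` values are all assumptions rather than the presenters' exact diagram.

```python
from typing import Optional


def categorize_problem(sufficient_data: bool,
                       labeled_data: bool,
                       goal: Optional[str] = None) -> str:
    """Walk the flowchart's decision nodes in an ASSUMED order."""
    if not sufficient_data:
        # "Sufficient Data?" answered no leads to "Get more!".
        return "get more data"
    if not labeled_data:
        # Assumption: unlabeled data leads to the Clustering node.
        return "clustering"
    if goal == "category":
        # "Sort into category?" answered yes.
        return "classification"
    if goal == "value":
        # "Predict a value?" answered yes.
        return "regression"
    # Neither question applies: "Refine Problem!".
    return "refine problem"


# Hypothetical usage: labeled supplier quotes, numeric target.
print(categorize_problem(True, True, "value"))
```

Under this reading, the three use cases later in the deck map to `"regression"` (price prediction), `"classification"` (targeted marketing), and `"clustering"` (news feed).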
  14. 14. Machine Learning Algorithms • What to look for in algorithms: – Flexible across many use cases – Able to handle several input types – Accurate – Resistant to over-fitting/noise/error
  15. 15. Machine Learning Algorithms • Random Forest – Used for classification and regression – Trains many decision trees on random subsets of the data and combines their results into a single estimate • XGBoost – Works on classification and regression – Starts off with a weak learner and adds further learners that correct earlier errors over successive iterations • K-Means – Used for clustering – Groups data points around k cluster centers, assigning each point to its nearest center
  16. 16. Tools and Technologies • Emphasis on tools which: – Can integrate with existing data architecture – Have a smooth learning curve – Simplify the process of analysis and prediction – Have an active community
  17. 17. Popular Machine Learning Tools: Python • Free, open-source, widely popular • Consolidates many important libraries written in Python and C • Has an active community • Disclaimer: Brand names, logos and trademarks used herein remain the property of their respective owners.
  18. 18. Popular Machine Learning Tools: R • Statistical computing language that simplifies complex statistical operations • Large number of libraries available for extending functionality (DB connectors, algorithms, visualization)
  19. 19. Price Prediction: Regression Problem • Scenario – Industrial MNC buys part assemblies from various suppliers – Supplier selection workflow is cumbersome and inflexible – Create a system to predict supplier price quotes and simplify the selection process
  20. 20. Price Prediction: Regression Problem • Data Available – Technical specifications and pricing parameters of past supplier quotes • Problem Type – Predict a numerical value (price quoted by supplier) – Numerical data (specs, prices, etc.) – Categorical data (part composition, payment options, etc.)
  21. 21. Price Prediction: Regression Problem • Algorithm Chosen: Random Forest – We are working with a mix of numerical and categorical variables – Large number of records but relatively low dimensionality of features (overfitting is not a big risk) – We expect a complex relationship between features
  22. 22. Price Prediction: Regression Problem • Result – Predicted the quote price given by the supplier with a relatively low error rate – Simplified the supplier selection workflow and opened avenues for complete automation in future
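To make the idea behind the price-prediction case concrete, here is a minimal standard-library sketch of bagging, the core mechanism of a random forest: several simple trees are each fit on a bootstrap sample of the rows, and their predictions are averaged. It is a deliberately simplified stand-in, not the presenters' model. It uses one-split trees ("stumps") and omits the per-split feature subsampling of a full random forest. The feature layout (`weight_kg`, `is_alloy`) and all numbers are hypothetical toy data.

```python
import random
from statistics import mean


def fit_stump(rows, targets):
    """Fit a one-split regression tree by minimizing squared error."""
    best = None
    for f in range(len(rows[0])):
        # Skip the largest value so both sides of the split are non-empty.
        for threshold in sorted({r[f] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            lm, rm = mean(left), mean(right)
            err = (sum((t - lm) ** 2 for t in left)
                   + sum((t - rm) ** 2 for t in right))
            if best is None or err < best[0]:
                best = (err, f, threshold, lm, rm)
    if best is None:
        # Every row in this sample is identical: predict the sample mean.
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] <= threshold else right_mean


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all stumps."""
    return mean(predict_stump(s, row) for s in forest)


# Hypothetical past quotes: [weight_kg, is_alloy] -> quoted price.
quotes = [[1.0, 0], [2.0, 0], [3.0, 1], [10.0, 0], [11.0, 1], [12.0, 1]]
prices = [10.0, 11.0, 12.5, 50.0, 51.5, 52.0]

forest = fit_forest(quotes, prices)
print(predict_forest(forest, [2.5, 0]))
```

In practice a library implementation with deeper trees and proper handling of categorical variables would be used; the sketch only shows why averaging many trees trained on different samples gives a smoother estimate than any single tree.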
  23. 23. Targeted Marketing: Classification Problem • Scenario – eLearning product is sold to universities, corporations and institutions across the globe – There is a need to improve the conversion rate through targeted marketing – Create a system to sort prospects into a specific segment
  24. 24. Targeted Marketing: Classification Problem • Data Available – Historical data of purchases and customer data from CRM • Problem Type: Predict a Category (Customer Segment) Based on – Numerical data (payment records) – Categorical data (customer profile data, product purchased by data)
  25. 25. Targeted Marketing: Classification Problem • Algorithm Chosen: Gradient Boosting Machine (XGBoost) – A mix of numerical and categorical values – Extremely high dimensionality and size of data – Parallel processing capacities could be useful – Overfitting could be a problem
  26. 26. Targeted Marketing: Classification Problem • Result – Created customer segments; new prospects entering the CRM are sorted into a segment and marketing campaigns are targeted to a particular segment – Sales people are better equipped with insights
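The "weak learner that improves over successive iterations" idea behind gradient boosting can be sketched in a few lines of plain Python. This is not XGBoost itself and not the presenters' model: it is a bare-bones gradient boosting loop for a two-segment (binary) problem with log-loss, where each round fits a one-split tree to the current residuals. The feature layout (`annual_spend_k`, `seats`), the segment labels, and the hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(rows, residuals):
    """Least-squares one-split tree fit to the current residuals."""
    best = None
    for f in range(len(rows[0])):
        # Skip the largest value so both sides of the split are non-empty.
        for threshold in sorted({r[f] for r in rows})[:-1]:
            left = [e for r, e in zip(rows, residuals) if r[f] <= threshold]
            right = [e for r, e in zip(rows, residuals) if r[f] > threshold]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((e - lm) ** 2 for e in left)
                   + sum((e - rm) ** 2 for e in right))
            if best is None or err < best[0]:
                best = (err, f, threshold, lm, rm)
    if best is None:
        raise ValueError("at least one feature must take two distinct values")
    return best[1:]


def stump_value(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] <= threshold else right_mean


def fit_boosted(rows, labels, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the gradient of the log-loss (y - p)."""
    scores = [0.0] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - sigmoid(s) for y, s in zip(labels, scores)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_value(stump, r)
                  for s, r in zip(scores, rows)]
    return learning_rate, stumps


def predict_proba(model, row):
    """Probability that the row belongs to segment 1."""
    learning_rate, stumps = model
    return sigmoid(sum(learning_rate * stump_value(s, row) for s in stumps))


# Hypothetical CRM rows: [annual_spend_k, seats]; 1 = corporate, 0 = university.
customers = [[5.0, 200], [8.0, 150], [6.0, 300],
             [40.0, 1200], [55.0, 900], [48.0, 1500]]
segments = [0, 0, 0, 1, 1, 1]

model = fit_boosted(customers, segments)
print(predict_proba(model, [50.0, 1000]))
```

Libraries such as XGBoost add regularization, deeper trees, multi-class objectives, and parallel tree construction on top of this loop, which is what makes them suitable for the large, high-dimensional data the slide describes.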
  27. 27. Personalized News Feed: Clustering Problem • Scenario – News feed engine publishes varied news content for users – Some level of categorization is done by humans – There is a need to personalize and recommend articles – Create a system to discover similar articles based on content
  28. 28. Personalized News Feed: Clustering Problem • Data Available – Text content of the news articles – User’s reading history
  29. 29. Personalized News Feed: Clustering Problem • Algorithm Chosen: K-Means Clustering – We are interested in sorting data points into an arbitrary series of clusters – There is no intrinsic metric for verifying the 'correctness' of a cluster; it must be checked by human oversight – We expect sorting to become more accurate with more data
  30. 30. Personalized News Feed: Clustering Problem • Result – Sorted articles into different clusters which are nominally identified by a label
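The clustering step in the news-feed case can be illustrated with a small standard-library sketch: articles are turned into word-count vectors and grouped with Lloyd's k-means algorithm. This is an illustration under stated assumptions, not the presenters' pipeline. The whitespace tokenizer, the raw-count features, the headlines, and the choice of `k=2` are all hypothetical; a production system would likely use a weighting scheme such as TF-IDF and a library implementation.

```python
import random
from collections import Counter


def vectorize(texts):
    """Turn each text into a word-count vector over a shared vocabulary."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))


def kmeans(points, k, n_iter=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        assignments = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment against the last set of centroids.
    return [nearest(p, centroids) for p in points], centroids


# Hypothetical headlines standing in for article text.
headlines = [
    "stock market rally lifts bank shares",
    "bank shares fall as stock market slides",
    "team wins cup final after penalty shootout",
    "striker scores twice in cup final win",
]
labels, _ = kmeans(vectorize(headlines), k=2)
print(labels)
```

The cluster indices carry no meaning of their own, which matches the slide's point that clusters are only nominally identified by a label and need human review.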
  31. 31. Conclusion • The amount of data available to enterprises is exploding • To remain competitive, enterprises will have to master their data • Machine learning provides a powerful framework for extracting meaning and actions from data
  32. 32. Q&A
  33. 33. Thank You! Visit us at: Write to us at: Blog: Twitter: (@HarbingerSys) Slideshare: Facebook: LinkedIn: Instagram: