Introduction to Machine Learning - WeCloudData

In this talk, WeCloudData introduces the lifecycle of machine learning and its tools/ecosystems. For more detail about WeCloudData's machine learning course please visit: https://weclouddata.com/data-science/

  1. Introduction to Machine Learning, WeCloudData (@WeCloudData, weclouddata, tordatascience)
  2. Introduction: Data Skills Training, Career Services, Meetup Events. WeCloudData offers Toronto’s first data science accelerator program. We specialize in teaching leading-edge tools such as AWS, Spark, and Machine Learning, and help our corporate clients upskill and reskill their data teams.
  3. Introduction: Faculty Team (21 Instructors, 10 Teaching Assistants). WCD works with some of the most talented and experienced data science experts to deliver public and corporate training. We currently have 21 part-time and 2 full-time instructors. Our instructors bring analytical expertise from various industries, teach students advanced tools such as Python, Hadoop, Spark, and AWS, and mentor students on end-to-end data projects.
  4. Product & Services: Corporate Training. Python for SAS and SQL Users; Machine Learning | Deep Learning; Big Data; Executive Workshops. We offer customized corporate training to Canadian companies with flexible schedules and learning support! We help train, upskill, and reskill data teams!
  5. WeCloudData Corporate Program: Corporate Data Programs. Python for SAS Users; Machine Learning; Big Data; AI/DS for Executives. We’ve delivered customized training to many large Canadian companies. We offer customized corporate training to Canadian companies with flexible schedules and learning support! We help train, upskill, and reskill data teams!
  6. Introduction: Communities we’re building. 8,000 members, 120 events. We organize one of the most active DS communities in Canada!
  7. Upcoming Events Schedule (tordatascience)

      | Track | Meetup Org | Topic | Date |
      |---|---|---|---|
      | Data Science | WCD | Introduction to Machine Learning | May 29 |
      | Big Data | WCD | Big Data for Data Scientist – Open Class | Jun 4 |
      | Big Data | WCD | Spark on Kubernetes | Jun 5 |
      | Big Data | Lightbend | Kafka in Jail with Strimzi | Jun 11 |
      | Cloud | Big Data & AI Conference | Machine Learning from Experimentation to Production on AWS | Jun 12 |
      | Big Data | Big Data & AI Conference | Transforming big data from On-premise to the Cloud | Jun 12 |
      | Data Science | Big Data & AI Conference | Spark for Data Science | Jun 13 |
      | Data Science | Big Data & AI Conference | Moving Towards a Python Environment | Jun 13 |

  8. Workshop Provider: Conference/Clients. WeCloudData is the workshop provider of choice for conferences in Toronto thanks to our expertise and specialization.
      • TMLS Conference (November 2018) and TD Canada Analytics Month (October 2018): Machine Learning Open Data; Spark ML and MLflow; Deep Learning with PyTorch; Python for SAS Users; Machine Learning with Python
      • Big Data & AI Toronto 2019 (June 2019): Big Data in AWS Cloud; Spark for Data Science; Moving from On-Prem to Cloud
  9. Analytics Events: we help companies with hiring and branding events. WeCloudData organizes one of the largest and most active data science communities in Toronto, with 7,500 members and 110 past events. We help companies facilitate mini-conferences and run hiring events.
  10. Instructor: Shaohua Zhang (career timeline figure, 2005 to 2018)
      • Co-founder and CEO of WeCloudData; lead instructor for the corporate training program
      • Certified SAS Predictive Modeler since 2007 (among the first 20 in the world)
      • Helped build and lead the data science team at BlackBerry (2010 – 2015)
      • Helps the Communitech incubator and Open Data Exchange mentor startups on data strategies
      • Specializes in machine learning, big data, and cloud computing
  11. Learning Path: Data Science Program (Prerequisites, Data Science Learning Path)
      • ML Applied: learn to build ML models using Sklearn
      • Data Science w/ Python: master data wrangling with Python
      • Big Data: harness big data with Hadoop, Hive, Presto, and AtScale
      • ML Advanced: build your portfolio with hands-on Capstone projects
      • Spark: Machine Learning at Scale with PySpark ML and Real-time Deployment
      • Contact us about the courses: info@weclouddata.com
      • Upcoming courses: https://weclouddata.com/upcoming-course-schedule
  12. DS Trainer
  13. Data Scientist
  14. Data Jobs in the Market: Data Handling, Complex Analytics, Big Data, Storytelling, Data Science, Data Scientist
  15. Data Science Essential Skills (Data Scientist)
      • Coding/Tools: Linux, Python/Scala/Java, Cloud (AWS), Hadoop, Spark
      • Math/ML: Statistics, Linear Algebra, Regression, Classification, Clustering, NLP
      • Storytelling: Presentation, Use cases, Project Mgmt, Communications
  16. Data Science: The Myth
      • Data: Application, Scraping/API, Labeled data
      • Infra/Platform: RDBMS, Hadoop, Cloud
      • Data Engineering: ETL, Enrichment, Dataflow automation
      • AI/ML: Python ML, Deployment, Prediction API, Stream processing
  17. Data Scientist: The Types
      • Operational DS. Focus: data wrangling, working with large/small messy data, building predictive models. Strength: data handling, tools, business knowledge.
      • ML Engineer. Focus: ML model deployment, data pipelines. Strength: coding, algorithms, machine learning, platforms and tools.
      • ML Researcher. Focus: algorithm development, research, IP. Strength: ML/DL algorithms, implementation, research.
      • DS Product Manager. Focus: product strategy, business communications, project management. Strength: product sense, business requirements, DS acumen.
  18. Predictive Modeler: Customer Value across the lifecycle (Acquisition, Growth, Maturity, Decline, Loss)
      ● Model types: Acquisition Models, LTV, Loyalty Management, Retention, Winback
      ● Use cases: Lead Gen, Digital Mktg, Mobile Ads, Cross/Up-sell, Segmentation, CLTV, Taste graph, Personalization, Loyalty Management, Context-based Mktg, Churn models, Retention, Winback models
      ● Predict high-risk customers
  19. Data Scientist
  20. Data Scientist
  21. Data Scientist, Business, Twitter API. Business: "Our new product feature received a lot of negative reviews. Can we do some analysis?"
  22. Introduction to Machine Learning
  23. Machine Learning. Humans: Height (in) vs. Weight (lbs)
  24. Machine Learning. Internet Providers: download speed (Mb/s) vs. upload speed (Mb/s)
  25. Machine Learning. Iris Flowers: Sepal Length vs. Petal Length
  26. Types of Machine Learning Algorithms

      | MACHINE LEARNING | CONTINUOUS | CATEGORICAL |
      |---|---|---|
      | SUPERVISED | REGRESSION | CLASSIFICATION |
      | UNSUPERVISED | DIMENSION REDUCTION | CLUSTERING |

  27. Machine Learning Tools
  28. Machine Learning Lifecycle
  29. ML Lifecycle
  30. ML Lifecycle
  31. Data Acquisition & Preparation
  32. Data Acquisition
      • Behavioral data
      • Scraped data
      • 3rd party data
      • Labeled data
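  33. Modeling Dataset: Credit Approval

      | ID (Index) | Age | Gender | Annual Salary | Months in Residence | Months in Job | Current Debt | Paid off Credit |
      |---|---|---|---|---|---|---|---|
      | Client 1 | 23 | M | $30,000 | 36 | 12 | $5,000 | Yes |
      | Client 2 | 30 | F | $45,000 | 12 | 12 | $1,000 | Yes |
      | Client 3 | 19 | M | $15,000 | 3 | 1 | $10,000 | No |
      | Client 4 | 25 | M | $25,000 | 12 | 27 | $15,000 | ? |

      Features (Predictors | Input Variables): Age through Current Debt. Labels (Target): Paid off Credit. ID (Index): the client column.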
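
As a minimal sketch, the credit approval table on slide 33 can be held in plain Python and separated into features and labels. The snake_case field names and the helper `split_features_and_label` are illustrative assumptions, not part of the deck; the "?" for Client 4 is represented as `None`.

```python
# Rows transcribed from the slide's table. Field names are illustrative.
COLUMNS = ("age", "gender", "annual_salary", "months_in_residence",
           "months_in_job", "current_debt", "paid_off_credit")
ROWS = {
    "Client 1": (23, "M", 30000, 36, 12, 5000, "Yes"),
    "Client 2": (30, "F", 45000, 12, 12, 1000, "Yes"),
    "Client 3": (19, "M", 15000, 3, 1, 10000, "No"),
    "Client 4": (25, "M", 25000, 12, 27, 15000, None),  # "?" on the slide
}
LABEL = "paid_off_credit"


def split_features_and_label(row):
    """Turn one table row into (feature dict, label)."""
    record = dict(zip(COLUMNS, row))
    label = record.pop(LABEL)
    return record, label


labeled, to_score = {}, {}
for client_id, row in ROWS.items():
    features, label = split_features_and_label(row)
    if label is None:
        to_score[client_id] = features          # no target yet: to be predicted
    else:
        labeled[client_id] = (features, label)  # usable for model fitting

print(sorted(labeled))   # ['Client 1', 'Client 2', 'Client 3']
print(sorted(to_score))  # ['Client 4']
```

The client ID stays outside the feature dictionary, since an index identifies a row but carries no predictive signal.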
  34. Data Preparation
  35. Data Exploration
  36. Exploratory Data Analysis
  37. Exploratory Data Analysis
  38. Train/Test Split
  39. Train/Validation/Test Split
  40. Holdout Approach: the Dataset is split into Train, Validation, and Testing sets
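
The holdout approach on slide 40 can be sketched in a few lines of standard-library Python. The function name `holdout_split`, the 60/20/20 proportions, and the fixed seed are assumptions for illustration; the deck does not state split sizes.

```python
import random


def holdout_split(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle rows once, then cut them into train / validation / test lists."""
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must sum to less than 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, validation, test


train, validation, test = holdout_split(range(100))
print(len(train), len(validation), len(test))  # 60 20 20
```

The validation set is typically used for choosing between models or parameters, while the test set is held back for a final performance estimate.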
  41. Cross-validation Approach: the Dataset is split into Train (used for cross-validation) and Testing
  42. Feature Processing (Engineering)
  43. Feature Preprocessing
      ● Derived features
      ● Feature scaling
      ● Variable binning
      ● One-hot encoding
      ● Weight of evidence
      ● TF-IDF
      ● etc.
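
A few of the preprocessing steps listed on slide 43 (a derived feature, feature scaling, variable binning, one-hot encoding) can be illustrated with the values from the credit approval table on slide 33. This is a rough sketch: the helper names, the bin edges, and the debt-to-income ratio are assumptions chosen for the example, not taken from the deck.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bin_value(value, edges):
    """Return the index of the first bin whose upper edge exceeds value."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def one_hot(values):
    """Encode each category as a 0/1 vector, one column per sorted category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


ages = [23, 30, 19, 25]
genders = ["M", "F", "M", "M"]
salaries = [30000, 45000, 15000, 25000]
debts = [5000, 1000, 10000, 15000]

scaled_salary = min_max_scale(salaries)                       # feature scaling
age_bins = [bin_value(a, edges=[20, 25]) for a in ages]       # variable binning
gender_encoded, gender_columns = one_hot(genders)             # one-hot encoding
debt_to_income = [d / s for d, s in zip(debts, salaries)]     # derived feature

print(age_bins)        # [1, 2, 0, 2]
print(gender_columns)  # ['F', 'M']
print(gender_encoded)  # [[0, 1], [1, 0], [0, 1], [0, 1]]
```

In practice the scaling range and bin edges would be learned from the training set only and then reused on validation and test data, so that no information leaks from the held-out rows.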
  44. Model Fitting
  45. Decision Tree
  46. Overfitting/Underfitting
  47. Model Evaluation & Selection
  48. Evaluation Metrics
  49. Parameter Tuning
  50. Cross-validation: the Dataset is split into Train and Testing; five folds give Performance 1 to Performance 5, summarized as Performance avg
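
The five per-fold scores and their average on slide 50 correspond to k-fold cross-validation. Below is a minimal standard-library sketch; `k_fold_indices`, `cross_validate`, the majority-class stand-in model, and the toy labels are all assumptions for illustration, and shuffling before folding is omitted for brevity.

```python
from collections import Counter


def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, test_idx) pairs; every row is tested exactly once."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        yield train_idx, test_idx
        start += size


def cross_validate(rows, labels, fit, score, k=5):
    """Fit on k-1 folds, score on the remaining fold, and average the k scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(rows), k):
        model = fit([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        scores.append(score(model,
                            [rows[i] for i in test_idx],
                            [labels[i] for i in test_idx]))
    return scores, sum(scores) / len(scores)


# Toy stand-in model: always predict the most common training label.
def fit_majority(rows, labels):
    return Counter(labels).most_common(1)[0][0]


def accuracy(model, rows, labels):
    return sum(model == y for y in labels) / len(labels)


rows = list(range(10))
labels = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]
fold_scores, average = cross_validate(rows, labels, fit_majority, accuracy)
print(fold_scores)  # [1.0, 0.5, 1.0, 0.5, 0.5]
print(average)      # 0.7
```

Averaging over folds gives a steadier estimate than a single validation split, at the cost of fitting the model k times.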
  51. Model selection: the Dataset is split into Train and Test; SVM and RandomForest candidates are trained and compared by Cross-validation, giving the Winning Model and Winning Parameter Set
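
Slide 51's idea of picking a winning model and parameter set by cross-validation can be sketched as follows. To keep the example self-contained, the SVM and RandomForest from the slide are swapped for two toy stand-ins (a majority-class baseline and a k-nearest-neighbours classifier with different `n_neighbors` values); all function names and the one-dimensional data are assumptions.

```python
from collections import Counter


def folds(n_rows, k):
    """Yield (train_idx, test_idx), holding out every k-th row in turn."""
    for f in range(k):
        test_idx = [i for i in range(n_rows) if i % k == f]
        train_idx = [i for i in range(n_rows) if i % k != f]
        yield train_idx, test_idx


def knn_predict(train_x, train_y, x, n_neighbors):
    """Vote among the n_neighbors training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    votes = Counter(train_y[i] for i in nearest[:n_neighbors])
    return votes.most_common(1)[0][0]


def majority_predict(train_x, train_y, x):
    """Baseline: ignore x and predict the most common training label."""
    return Counter(train_y).most_common(1)[0][0]


def mean_cv_accuracy(predict, xs, ys, k=5):
    """Average accuracy of a predict function over k cross-validation folds."""
    scores = []
    for train_idx, test_idx in folds(len(xs), k):
        tx = [xs[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        hits = sum(predict(tx, ty, xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


xs = [0.1, 0.4, 0.5, 0.9, 1.1, 1.8, 2.2, 2.9, 3.1, 3.6]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Each candidate is a (model family, parameter) pair.
candidates = {
    ("majority", None): majority_predict,
    ("knn", 1): lambda tx, ty, x: knn_predict(tx, ty, x, 1),
    ("knn", 3): lambda tx, ty, x: knn_predict(tx, ty, x, 3),
}
results = {name: mean_cv_accuracy(fn, xs, ys) for name, fn in candidates.items()}
winner = max(results, key=results.get)
print(winner)  # ('knn', 1)
```

After the winner is chosen this way, it would normally be refit on the full training portion and scored once on the untouched test set.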
  52. Model Interpretation
  53. Linear Classification Model Interpretation – Logistic Regression
  54. Non-linear Classification Model Interpretation – Decision Tree
  55. Complex Model Interpretation – Surrogate Model
  56. Complex Model Interpretation – Feature Importance. Feature importance plots are a common way to explain models, but they are not ideal. For instance, they give no indication of the direction of a relationship, or of whether it is linear or non-linear.
  57. Complex Model Interpretation – Feature Importance. Feature importance plots are a common way to explain models, but they are not ideal. For instance, they give no indication of the direction of a relationship, or of whether it is linear or non-linear.
  58. Complex Model Interpretation – LIME. LIME is short for Local Interpretable Model-Agnostic Explanations. Each part of the name reflects something that we desire in explanations. Local refers to local fidelity, i.e., we want the explanation to really reflect the behaviour of the classifier "around" the instance being predicted. This explanation is useless unless it is interpretable, that is, unless a human can make sense of it. LIME is able to explain any model without needing to 'peek' into it, so it is model-agnostic. All previously mentioned methods can give an idea of the global behaviour of the model, but they fail to tell why a particular instance is classified one way or the other.
      1. Perturb the observation
      2. Calculate the distance between the perturbed data and the original observation
      3. Make predictions on the perturbed data using the complex model
      4. Pick the m features that best describe the complex model's outcome on the perturbed data
      5. Fit a simple model to the perturbed data with the m features, using similarity scores as weights
      6. The feature weights from the simple model explain the complex model's local behaviour
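
The LIME steps on slide 58 can be approximated with a small local-surrogate sketch in standard-library Python. This is not the `lime` package: the `black_box` function, the Gaussian perturbation, the exponential kernel, and all parameter values are assumptions, and step 4 (feature selection) is skipped because the toy model has only two features.

```python
import math
import random


def black_box(x):
    """Stand-in for a complex model: a non-linear score of two features."""
    return 1.0 / (1.0 + math.exp(-(x[0] ** 2 - 2.0 * x[1])))


def solve(a, b):
    """Solve the linear system a @ w = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def explain_locally(model, instance, n_samples=500, scale=0.3,
                    kernel_width=0.75, seed=0):
    rng = random.Random(seed)
    # Step 1: perturb the observation with Gaussian noise.
    samples = [[v + rng.gauss(0.0, scale) for v in instance]
               for _ in range(n_samples)]
    # Step 2: distance to the original, converted to a similarity weight.
    weights = [math.exp(-math.dist(s, instance) ** 2 / kernel_width ** 2)
               for s in samples]
    # Step 3: predictions of the complex model on the perturbed data.
    preds = [model(s) for s in samples]
    # Step 5: weighted least squares for a linear surrogate
    # (intercept plus one weight per feature, centred on the instance).
    design = [[1.0] + [v - c for v, c in zip(s, instance)] for s in samples]
    d = len(design[0])
    xtwx = [[sum(w * row[i] * row[j] for w, row in zip(weights, design))
             for j in range(d)] for i in range(d)]
    xtwy = [sum(w * row[i] * p for w, row, p in zip(weights, design, preds))
            for i in range(d)]
    coef = solve(xtwx, xtwy)
    # Step 6: the surrogate's feature weights are the local explanation.
    return coef[0], coef[1:]


intercept, feature_weights = explain_locally(black_box, [1.0, 0.5])
print(feature_weights)  # first weight positive, second negative
```

Around the point (1.0, 0.5) the toy score rises with the first feature and falls with the second, so the surrogate's weights should carry those signs; unlike a feature importance plot, this gives the direction of each local effect.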
  59. Applied Machine Learning: Course Detail
  60. Applied Machine Learning Syllabus (Weekend Cohort – 12 sessions/48 hours)

      | Lecture | Topic | Content |
      |---|---|---|
      | 1 | Introduction | Introduction to Machine Learning; Gradient Descent |
      | 2 | Regularization | Regularization; Lasso/Ridge/ElasticNet |
      | 3 | Logistic Regression | Logistic Regression; Multi-class Classification; Evaluation Metrics; Variance/Bias Tradeoff |
      | 4 | Feature Engineering | Numerical Features; Categorical Features; Text Features |
      | 5 | Non-parametric Models | KNN; Decision Trees; Project kick-off |
      | 6 | Parameter Tuning | Ensemble Methods; Bagging; Boosting; Hyper-parameter Tuning |
      | 7 | Advanced Ensembles | XGBoost; Stacking |
      | 8 | Model Interpretation | Factorization Machines; Complex Model Interpretation |
      | 9 | Unsupervised | K-Means Clustering; Dimension Reduction; PCA |
      | 10 | Neural Networks I | Neural Networks; Backpropagation |
      | 11 | Recommendation Engines | Market Basket Analysis; Collaborative Filtering; Matrix Factorization |
      | 12 | Model Deployment | Machine Learning Lifecycle; Model Deployment; Project Presentation |

  61. Applied Machine Learning Instructor – Jodie Zhu (Machine Learning Engineer, Dessa)
      • Machine Learning Engineer at Dessa
      • University of Toronto, Master of Science (Biostatistics)
      • Python Instructor at WeCloudData
      • Career development mentor
      • Expertise: Python | Data Science | Deep Learning
  62. Python Programming Instructor – Holly Xie (Machine Learning Scientist, integrate.ai)
      • Machine Learning Scientist at integrate.ai
      • University of Waterloo, Master of Mathematics
      • Machine Learning Instructor at WeCloudData
      • Expertise: Machine Learning | Deep Learning
  63. Applied Machine Learning: Hands-on Project. This course is instructor-led and project-based. Students will apply the machine learning knowledge acquired in the course to a hands-on project.
      • The instructor will work with the students to decide the project topics. Students are strongly encouraged to bring their own motivation and ideas; otherwise, a topic and datasets will be assigned.
      • Students are also encouraged to apply what they learn directly to their company’s data problems and to seek technical advice from the instructor.
  64. Applied Machine Learning: Interview Practice. For job seekers, this course also provides supplementary materials to help you prepare for data science interviews.
      • Common ML interview questions
      • Mock interview quiz
  65. Applied Machine Learning: Course Pricing. Applied Machine Learning: $2000 + tax
  66. Upcoming WeCloud Events: Event Schedules
  67. Upcoming Events Schedule (tordatascience)

      | Track | Meetup Org | Topic | Date |
      |---|---|---|---|
      | Data Science | WCD | Introduction to Machine Learning | May 29 |
      | Big Data | WCD | Big Data for Data Scientist – Open Class | Jun 4 |
      | Big Data | WCD | Spark on Kubernetes | Jun 5 |
      | Big Data | Lightbend | Kafka in Jail with Strimzi | Jun 11 |
      | Cloud | Big Data & AI Conference | Machine Learning from Experimentation to Production on AWS | Jun 12 |
      | Big Data | Big Data & AI Conference | Transforming big data from On-premise to the Cloud | Jun 12 |
      | Data Science | Big Data & AI Conference | Spark for Data Science | Jun 13 |
      | Data Science | Big Data & AI Conference | Moving Towards a Python Environment | Jun 13 |

  68. Type of Data Job Seekers
