globalaihub.com
Introduction to Machine
Learning
Before Starting the Course
● Upon completion of this course, you will have acquired general knowledge of introductory artificial
intelligence algorithms and data analysis.
● Practice extensively after each lesson to consolidate what you have learned.
● The slides rely mostly on images rather than text, so it is extremely important to take notes
during the lesson.
● The slide titles name the basic algorithms. Research each title, code plenty of applications,
and consult sites that cover the underlying theory.
What You Will Learn In This Course
1. What is Machine Learning?
2. What innovations has Machine Learning brought to our lives?
3. What are the requirements and additional capabilities for Machine Learning?
4. What is Data Science?
5. What is data cleaning, preparation and feature engineering?
6. What are the steps of a Machine Learning project?
7. What are the tools used in Machine Learning?
8. What are useful resources for Machine Learning?
9. Hands-on Training: What does a machine learning project look like?
Mert Çobanov
Github: cobanov
Linkedin: mertcobanoglu
Twitter: mertcobanov
Resources
● Deep Learning Türkiye
● Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow, Aurélien Géron
● Stanford University, stanford.edu/~shervine/, Shervine Amidi
What is Machine Learning?
Machine Learning is the science (and art) of programming computers so that they can learn from
data.
Machine Learning is the field of
study that gives computers the
ability to learn without being
explicitly programmed.
—Arthur Samuel, 1959
Machine Learning Disambiguation
Types of Machine Learning
Machine Learning Comparison
Machine Learning Algorithms
Machine Learning Applications
● Segmenting customers based on their purchases so you can design a different marketing strategy
for each segment
● Suggesting a product that a customer might be interested in based on past purchases
● Analyzing images of products on a production line to classify them automatically
● Automatically categorizing news articles
● Automatically flagging offensive comments on discussion forums
● Creating a chatbot or personal assistant
● Predicting your company's revenue for the next year based on many performance metrics
● Making your app respond to voice commands
● Creating a smart bot for a game
Mathematics in Machine
Learning
Linear Algebra
Probability and Statistics
● Distributions
● Descriptive statistics and statistical tests
● Model performance measurements
Data Science
Working with Data
Data science is a multidisciplinary field that uses
scientific methods, processes, algorithms and systems to
extract information and insights from structured and
unstructured data.
Data science combines statistics, data analysis,
machine learning, and related methods to understand
and analyze real-world phenomena through data. It draws
on techniques and theories from fields such as
mathematics, statistics, and computer science.
Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data.
Features are used by predictive models and affect results.
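As a minimal sketch of the idea, the example below derives a few features from raw order records with pandas. The column names (`order_time`, `price`, `quantity`) and the derived features are hypothetical and chosen only for illustration; which features are useful depends on the domain.

```python
import pandas as pd

# Hypothetical raw data: one row per order.
raw = pd.DataFrame({
    "order_time": ["2024-01-05 09:30", "2024-01-06 22:15", "2024-01-07 13:00"],
    "price": [10.0, 25.0, 8.0],
    "quantity": [3, 1, 5],
})

def add_features(df):
    """Return a copy of df with features derived from the raw columns."""
    out = df.copy()
    ts = pd.to_datetime(out["order_time"])
    out["hour"] = ts.dt.hour                       # time of day
    out["is_weekend"] = ts.dt.dayofweek >= 5       # Saturday=5, Sunday=6
    out["total"] = out["price"] * out["quantity"]  # value of the order
    return out

features = add_features(raw)
print(features[["hour", "is_weekend", "total"]])
```

The raw frame is left untouched, so the same function can be rerun whenever the feature definitions change.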
Model Training
Machine Learning Project Steps
1. See the big picture and understand the project
2. Collect the data
3. Examine and visualize the data
4. Prepare the data for machine learning models
5. Select and train a model
6. Optimize the model
7. Integrate the model into the system
Big picture and understanding the project
1. Define the objective in business terms.
2. What are the current solutions or workarounds?
3. How should you frame this problem
(supervised/unsupervised, online/offline, etc.)?
4. How should performance be measured?
5. What is the minimum performance required to
meet the business goal?
6. Is human expertise available?
7. How would you solve the problem manually?
8. List the assumptions you (or others) have made so far.
Collect Data
Prepare a Working Environment
Machine learning projects should store their data in an orderly way. In
particular, the raw data and its structure must never be altered. The
preprocessed data and the data sent to the model should each be stored
separately. To that end, set up appropriate databases or a clear folder
hierarchy.
Get the Data
Data can be obtained from many sources. Data sets downloaded from the
internet are suitable while learning. Working with real-world data sets is
somewhat harder because the data is not always easy to obtain, although it
can often be collected from the internet with software.
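One possible folder hierarchy can be sketched with the standard library. The folder names (`raw`, `preprocessed`, `model_input`) are assumptions chosen to mirror the three kinds of data mentioned above, not a fixed convention.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: each stage writes to its own folder, so the raw
# data is never overwritten by later processing steps.
STAGES = ("raw", "preprocessed", "model_input")

def make_data_dirs(project_root):
    """Create data/<stage> folders under project_root and return their paths."""
    paths = {}
    for stage in STAGES:
        path = Path(project_root) / "data" / stage
        path.mkdir(parents=True, exist_ok=True)
        paths[stage] = path
    return paths

# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    dirs = make_data_dirs(tmp)
    created = sorted(p.name for p in (Path(tmp) / "data").iterdir())
    print(created)
```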
Examine and Visualize Data
Exploratory Data Analysis (EDA)
EDA is the work that lets us get to know the data quickly. It consists of creating simple graphs (e.g. box
plots, scatter plots) that help draw a picture of a data set, along with summary statistics (mean,
median, quantiles, etc.).
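A first pass at EDA might look like the following sketch with pandas. The data set and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical data set, invented for this example.
df = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63],
    "income": [1800, 3200, 4100, 5200, 2500, 4800],
})

summary = df.describe()                          # count, mean, std, min, quartiles, max
median_age = df["age"].median()
quartiles = df["income"].quantile([0.25, 0.5, 0.75])
corr = df["age"].corr(df["income"])              # Pearson correlation

print(summary)
print(median_age, corr)

# With matplotlib installed, the same frame can draw the plots named above:
# df.boxplot(column="income")
# df.plot.scatter(x="age", y="income")
```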
Preparing the Data for the Model
1. Cleaning and Editing Data
○ Removing unnecessary or uninformative data
○ Fixing incorrect data
○ Combining data from different sources
2. Determining Data Types
○ Checking that values are in the expected format (date, numeric, text, etc.)
○ Performing appropriate data type conversions
3. Dimensionality Reduction
○ PCA
○ Elimination of redundant columns and correlation
analysis
4. Examination of Data Distributions and Normalization
○ Min-max scaling
○ Standardization
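The two scaling methods in the last step can be written out directly. Min-max scaling maps each column to [0, 1] with x' = (x - min) / (max - min); standardization gives each column mean 0 and standard deviation 1 with z = (x - mean) / std. The sketch below uses NumPy and assumes no column is constant (otherwise the denominators are zero); scikit-learn provides `MinMaxScaler` and `StandardScaler` for the same purpose.

```python
import numpy as np

def min_max_scale(x):
    """Rescale each column to the range [0, 1]. Assumes no constant column."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo)

def standardize(x):
    """Give each column mean 0 and standard deviation 1. Assumes std > 0."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)

# Two columns on very different scales (values invented for illustration).
data = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
scaled = min_max_scale(data)
standardized = standardize(data)
print(scaled)
print(standardized)
```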
Model selection and training of the model
Performance Metrics: Classification
Confusion Matrix
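For binary classification, a confusion matrix counts true positives, true negatives, false positives, and false negatives, and the usual metrics follow from those four numbers. The slide's image is not reproduced in this text, so the metrics chosen here (accuracy, precision, recall, F1) are an assumption about what it showed; the labels are invented.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tp, tn, fp, fn

# Invented labels for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the predicted positives, how many are correct
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
print(tp, tn, fp, fn, accuracy, precision, recall, f1)
```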
Performance Metrics: Regression
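The slide's image is not reproduced in this text, so the sketch below assumes four commonly used regression metrics: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and the coefficient of determination (R²). The values are invented.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R^2 for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "mae": float(np.mean(np.abs(err))),
        "mse": float(mse),
        "rmse": float(np.sqrt(mse)),
        "r2": float(1.0 - ss_res / ss_tot),
    }

metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 8.0, 9.5])
print(metrics)
```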
Optimizing the model
● Fine-tune hyperparameters using cross validation
● Run hyperparameter searches such as grid search
● Try ensemble methods. Combining your best models
often performs better than running them individually
● Use as much data as possible for this step,
especially as you approach the end of
fine-tuning
● Once you are confident in your final model, measure its
performance on the test set to estimate its
generalization error
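The first, second, and last points can be sketched with scikit-learn's `GridSearchCV`. The data set (iris), the model (a decision tree), and the parameter grid are assumptions chosen to keep the example small; they are not taken from the course.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out a test set that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical grid: every combination is scored with 5-fold cross validation.
param_grid = {"max_depth": [2, 3, 4], "min_samples_leaf": [1, 5]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)

# Only after the final model is chosen: estimate the generalization error.
test_accuracy = search.score(X_test, y_test)
print(test_accuracy)
```

Keeping the test set out of the search is what makes the final score a fair estimate. Reusing it during tuning would make the estimate optimistic.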
Integrating the model into the system (Deployment)
ML deployment is the integration of a data-driven machine
learning model into an existing production environment.
The machine learning project developed in the test
environment is made ready to run on platforms such as web
services (SaaS, PaaS) in order to serve the end user.
● Streamlit is an open-source Python library that helps
you create interactive web applications without
knowledge of HTML, CSS, etc.
● Docker is one of the most popular ways for
developers to containerize their code.
● Heroku is a cloud application platform
provider (PaaS).
Machine Learning
Terminologies
Cross Validation
Cross validation, also called CV, is a method used to select a model that does not rely heavily on the
initial training set.
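A minimal sketch of k-fold cross validation, written with NumPy to show the mechanics: the data is split into k folds, and each fold serves once as the validation set while the model is trained on the rest. The synthetic data, the straight-line model, and the helper name `k_fold_indices` are assumptions for illustration; scikit-learn offers `KFold` and `cross_val_score` for real use.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

# Synthetic data: y = 2x + 1 plus Gaussian noise.
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

fold_errors = []
for train_idx, val_idx in k_fold_indices(len(x), k=5):
    # Fit a line on the training folds, score it on the held-out fold.
    slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)
    pred = slope * x[val_idx] + intercept
    fold_errors.append(float(np.mean((y[val_idx] - pred) ** 2)))

cv_error = float(np.mean(fold_errors))   # average validation MSE over the folds
print(fold_errors, cv_error)
```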
Bias - Variance Tradeoff
Bias: The bias of a model is the difference between its expected prediction and the correct value we
are trying to predict for the given data points.
Variance: The variance of a model is the variability of the model's prediction for the given data points.
Bias/variance tradeoff: The simpler the model, the higher the bias, and the more complex the
model, the higher the variance.
Symptoms
Underfitting
● High training error
● Training error close to test error
● High bias
Ideal
● Training error slightly lower than test error
Overfitting
● Very low training error
● Training error much lower than test error
● High variance
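These symptoms can be observed by fitting polynomials of increasing degree to the same noisy data and comparing training and test error. The sketch below makes several assumptions for illustration: a sine curve as the true function, Gaussian noise, and degrees 1, 3, and 9. The exact numbers depend on the random seed.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Sample n noisy points from an assumed true function, sin(2*pi*x)."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with these coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

# Degree 1 is too simple for a sine curve; degree 9 is flexible enough
# to start fitting the noise in only 20 training points.
results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(degree, results[degree])
```

Training error can only go down as the degree grows, because each higher-degree model contains the lower-degree ones. The gap between training and test error is what signals overfitting.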
Early Stopping
Tools Used in Machine
Learning
Tools Used in Machine Learning
Python is a general-purpose programming language.
It is interpreted and dynamically typed, and it
supports both object-oriented and functional
programming approaches.
• Rapid prototyping
• Simple syntax
• Easy to use
• Large community
Tools Used in Machine Learning
NumPy is the fundamental package for scientific
computing in Python.
• Creating arrays
• Vectorization and slicing
• Matrices and simple linear algebra
• Reading and writing data files
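Each of the four points can be shown in a few lines. The values and the file name `matrix.npy` are arbitrary choices for this sketch.

```python
import os
import tempfile

import numpy as np

# Creating arrays
a = np.arange(6)                 # the integers 0 through 5
m = a.reshape(2, 3)              # the same data viewed as a 2x3 matrix

# Vectorization and slicing
squares = a ** 2                 # element-wise, no Python loop needed
evens = a[::2]                   # every second element
first_col = m[:, 0]              # first column of the matrix

# Simple linear algebra: solve A @ x = b
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Data files: save to NumPy's binary .npy format and load it back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matrix.npy")
    np.save(path, m)
    loaded = np.load(path)

print(squares, evens, first_col, x)
```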
Tools Used in Machine Learning
Pandas is an open source Python library that facilitates
data analysis and data preprocessing.
• Useful functions for data manipulation
• Tools to read and write data between different
formats: CSV and text files, Microsoft Excel, SQL
databases
• Fast data visualization at a simple level
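A short sketch of reading, manipulating, and writing data with pandas. The CSV content is invented, and in-memory buffers stand in for real files so the example is self-contained.

```python
import io

import pandas as pd

# Invented CSV content; pd.read_csv accepts a file path in the same way.
csv_text = """city,year,sales
Ankara,2023,120
Izmir,2023,95
Ankara,2024,150
Izmir,2024,110
"""

df = pd.read_csv(io.StringIO(csv_text))

# Data manipulation: filter rows, then aggregate per group.
recent = df[df["year"] == 2024]
by_city = df.groupby("city")["sales"].sum()

# Write the result back out as CSV (into a buffer instead of a file).
buffer = io.StringIO()
by_city.to_csv(buffer)
print(buffer.getvalue())

# With matplotlib installed, a quick chart is one call: by_city.plot.bar()
```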
Tools Used in Machine Learning
Matplotlib is a data visualization and plotting
library for the Python programming language.
• It is one of the most important tools for
scientific programming with Python
• It is a powerful library that can
visualize data interactively
• It produces high-quality output suitable
for printing and publication
• It can produce both two-dimensional and
three-dimensional graphics
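A minimal plotting sketch follows. It selects the non-interactive `Agg` backend so that it runs without a display, and it writes the PNG to an in-memory buffer instead of a file; both are choices made for this example. The 300 dpi setting is an assumed value for print-quality output.

```python
import io

import matplotlib
matplotlib.use("Agg")            # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)", linestyle="--")
ax.set_xlabel("x")
ax.set_ylabel("value")
ax.set_title("Sine and cosine")
ax.legend()

# Save at a resolution suitable for printing.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=300)
png_bytes = buffer.getvalue()
plt.close(fig)
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` opens the figure in a window instead.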
Tools Used in Machine Learning
Scikit-learn is a free software machine learning
library for the Python programming language.
It includes many basic methods such as linear
regression, logistic regression, decision trees,
and random forests.
https://scikit-learn.org/stable/
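Scikit-learn estimators share the same `fit` and `score` interface, so several of the methods named above can be tried with identical code. The sketch below assumes the bundled iris data set and mostly default settings, purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)                       # train on the training split
    accuracies[name] = model.score(X_test, y_test)    # accuracy on held-out data

print(accuracies)
```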
Datasets
Kaggle
Kaggle is an online community for data scientists and machine learning practitioners.
It is a platform where people and organizations with problems large or small publish their data and
describe the problem, and participants compete to solve it using the information
provided.
• Hundreds of datasets
• Prize competitions
• Education and guides
UCI
UCI is the dataset repository provided by the
machine learning and intelligent systems research
center at the University of California, Irvine.
It currently hosts 588 datasets as a service to the
machine learning community.
https://archive.ics.uci.edu/ml/index.php
