1. Sejong University
Week 1: Introduction Applied Machine Learning
Applied Machine Learning
Spring 2023
Prof. Rizwan Ali Naqvi
School of Intelligent Mechatronics Engineering,
Sejong University, Republic of Korea.
Lecture 1:
Introduction to Machine Learning
My Introduction! Rizwan Ali Naqvi
Educational Qualification:
B.S. from Comsats University, Pakistan
M.S. from Karlstad University, Sweden
Ph.D. from Dongguk University, South Korea
Experiences:
Teaching: 4+ years of experience
Research: 8+ years of experience
Research Interests:
Computer Vision, Biometrics, Medical Image Analysis, Computer-Aided Diagnosis, Deep Learning, Artificial Intelligence, etc.
About you
• Brief introduction
– Name, nationality, program, research area?
• Why are you taking this course?
• How much do you know about machine learning?
Some Rules
• The class rules are as follows:
– Raise your hand before asking a question.
– Mark your attendance on the U_Check app.
– Attendance is always taken twice:
• Once at the start of the class (you will be marked late if you join after 10 minutes).
• Once at the end of the class.
– Never miss a class.
– Never “sleep” in class.
– Always communicate in the official communication language.
– Direct all your problems and queries to me.
Books
• “Practical Machine Learning” by Sunila Gollapudi
• “Introduction to Machine Learning” by Alpaydin, E.
Projects/Assignments
• There will be 3 assignments and 2 quizzes.
• 1 semester project will be assigned.
• Deadlines are always final.
• No credit for late submissions.
• Only the latest version will be considered.
Dishonesty, Plagiarism in Quizzes, Assignments & Projects
• All parties involved in any kind of cheating in any exam will get zero in that exam.
Tentative Evaluation Breakdown
Assignments + Quizzes 15
Project 05
Attendance 10
Mid-term 30
Final 40
Total 100
Course Outline
• In this course you will:
– Gain a comprehensive understanding of the concepts and techniques used in machine learning.
– Study the theoretical and practical aspects of machine learning, including supervised and unsupervised learning, deep learning, etc.
– Gain hands-on experience with popular machine learning libraries and frameworks.
– Apply the concepts and techniques covered in the course to a real-world problem.
– Develop the skills and knowledge to conduct independent research in the field.
A Few Quotes
“A breakthrough in machine learning would be worth ten Microsofts”
– (Bill Gates, Chairman, Microsoft)
“Machine learning is the next Internet”
– (Tony Tether, Director, DARPA)
“Machine learning is the hot new thing”
– (John Hennessy, President, Stanford)
“Web rankings today are mostly a matter of machine learning”
– (Prabhakar Raghavan, Dir. Research, Yahoo)
“Machine learning is going to result in a real revolution”
– (Greg Papadopoulos, CTO, Sun)
“Machine learning is today’s discontinuity”
– (Jerry Yang, CEO, Yahoo)
So What Is Machine Learning (Informally)?
• Automating automation
• Getting computers to program themselves
• Writing software is the bottleneck
• Let the data do the work instead!
Traditional Programming vs Machine Learning
• Traditional Programming
• Machine Learning
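The contrast can be sketched in a few lines of Python. This is only an illustration: the messages, the function names, and the crude word-count scoring are assumptions made up for this sketch, not something taken from the slides. In the first function a person writes the rule; in the second, the rule (a table of word scores) is produced from labeled examples.

```python
from collections import Counter

# Traditional programming: a person writes the rule by hand.
def is_spam_handwritten(text):
    return "offer" in text.lower().split()

# Machine learning: the "program" (a table of word scores) comes from data.
def learn_word_scores(examples):
    """Add 1 per spam message containing a word, subtract 1 per non-spam message."""
    scores = Counter()
    for text, is_spam in examples:
        for word in set(text.lower().split()):
            scores[word] += 1 if is_spam else -1
    return scores

def is_spam_learned(text, scores):
    # Unseen words score 0 because Counter returns 0 for missing keys.
    return sum(scores[word] for word in set(text.lower().split())) > 0

# Hypothetical toy data, invented for illustration only.
examples = [
    ("urgent unsecured business offer", True),
    ("cash offer we will help", True),
    ("urgent cash needed", True),
    ("schedule for the meeting", False),
    ("hi professor about the meeting", False),
    ("dear professor course schedule", False),
]
scores = learn_word_scores(examples)

message = "urgent business cash"
print(is_spam_handwritten(message))      # the hand-written rule only looks for "offer"
print(is_spam_learned(message, scores))  # the learned scores use every word seen in training
```

The hand-written rule has to be edited whenever spam changes; the learned one only needs new labeled examples.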
What Is Machine Learning (Formally)?
"A computer program is said to learn from experience E with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by P, improves with experience E."
--Tom M. Mitchell
"Machine learning is the training of a model from data that generalizes a decision against a performance
measure."
--Jason Brownlee
"A branch of artificial intelligence in which a computer generates rules underlying or based on raw data that has
been fed into it."
--Dictionary.com
"Machine learning is a scientific discipline that is concerned with the design and development of algorithms that
allow computers to evolve behaviors based on empirical data, such as from sensor data or databases."
--Wikipedia
Why “Learn”?
• Machine learning is programming computers to optimize a performance
criterion using example data or past experience.
• There is no need to “learn” to calculate payroll
• Learning is used when:
– Human expertise does not exist (navigating on Mars),
– Humans are unable to explain their expertise (speech recognition)
– Solution changes in time (routing on a computer network)
What We Talk About When We Talk About “Learning”
• Learning general models from data of particular examples
• Data is cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce.
• Build a model that is a good and useful approximation to the data.
ML at a Glance
• Machine Learning
– Study of algorithms that
– improve their performance
– at some task
– with experience
• Optimize a performance criterion using example data or past experience.
• Role of Statistics: Inference from a sample
• Role of Computer science: Efficient algorithms to
– Solve the optimization problem
– Represent and evaluate the model for inference
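As a minimal sketch of "optimizing a performance criterion using example data", the snippet below fits a single weight w in the model y = w * x by gradient descent on the mean squared error. The data, the learning rate, and the number of steps are assumptions chosen for illustration; the slides do not prescribe a particular optimizer.

```python
def mean_squared_error(w, xs, ys):
    """Performance criterion: average squared gap between prediction and target."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical example data generated by y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0                # initial guess
learning_rate = 0.01   # assumed step size
for _ in range(200):
    w -= learning_rate * gradient(w, xs, ys)

print(round(w, 3), round(mean_squared_error(w, xs, ys), 6))
```

Each step moves w a little in the direction that lowers the error, so the estimate approaches the value that generated the data.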
Commercial Motivation for Machine Learning
• Lots of data is being collected and warehoused
– Web data, e-commerce
– Purchases at department/grocery stores
– Bank/credit card transactions
• Computers have become cheaper and more powerful
• Competitive pressure is strong
– Provide better, customized services for an edge (e.g. in Customer Relationship Management)
Scientific Motivation for Machine Learning
• Data is collected and stored at enormous speeds (GB/min)
– Remote sensors on a satellite
– Telescopes scanning the skies
– Microarrays generating gene expression data
– Scientific simulations generating terabytes of data
• Traditional techniques are infeasible for raw data
• Machine learning may help scientists
– in classifying and segmenting data
– in hypothesis formation
Example: Predicting how a viewer will rate a movie
• 10% improvement = 1 million dollar prize
• The essence of machine learning
– A pattern exists
– We cannot pin it down mathematically
– We have data on it
Movie rating – a solution
The Learning Approach
Components of Learning
Components of Learning: Formalization
A simple hypothesis set
[Figures for the five slides above: http://work.caltech.edu/slides/slides01.pdf]
Types of Learning
• Supervised (inductive) learning
– Training data includes desired outputs
• Unsupervised learning
– Training data does not include desired outputs
• Reinforcement learning
– Rewards from sequence of actions
Supervised & Unsupervised Learning
Supervised Learning
• Labeled dataset
• Establish the relationship between input and output
• Generate output for new data points
• Reliable models, but expensive and limited
• Classification: Associative classifiers, Decision Trees, Instance Learning, Bayesian Learning, Kernel machines, Neural Networks, Genetic Algorithms, etc.
• Regression: Linear Regression, …
Unsupervised Learning
• Unlabeled dataset
• Decipher the structure of the data
• Output attributes are not defined
• Clustering: K-means, DBSCAN, Hierarchical algorithms, Self-Organizing Maps, etc.
• Associations: Apriori algorithms, FP-Growth algorithms, … [useful in data mining]
Reinforcement Learning
• Maximizing the rewards from the results
• Also called credit assessment learning
• Additional decision about rewards
• Explore the tradeoff between exploring and exploiting the data
Supervised Learning
• In supervised learning, the training data provided to the machine works as the supervisor that teaches it to predict the output correctly. It follows the same idea as a student learning under the supervision of a teacher.
• Supervised learning is the process of providing input data together with the correct output data to the machine learning model. The aim of a supervised learning algorithm is to find a mapping function from the input variable (x) to the output variable (y).
Steps Involved in Supervised Learning
• First, determine the type of training dataset.
• Split the dataset into a training set, a validation set, and a test set.
• Determine the input features of the training dataset, which should carry enough information for the model to predict the output accurately.
• Determine a suitable algorithm for the model, such as a support vector machine or a decision tree.
• Execute the algorithm on the training dataset. Sometimes a validation set, a subset held out from the training data, is needed to tune the control parameters.
• Evaluate the accuracy of the model on the test set. If the model predicts the correct outputs, the model is accurate.
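The steps above can be sketched end to end with a small k-nearest-neighbour classifier. Everything concrete here is an assumption made for illustration: the synthetic two-class data, the 120/40/40 split, and the candidate values of k. The validation set is used only to pick the control parameter k, and the test set is touched once at the end.

```python
import random
from collections import Counter

random.seed(0)

def make_point(label):
    """Synthetic 2-D point: class 0 is centred at (0, 0), class 1 at (3, 3)."""
    center = 0.0 if label == 0 else 3.0
    return (random.gauss(center, 1.0), random.gauss(center, 1.0)), label

# Steps 1-2: collect a labeled dataset and split it.
data = [make_point(i % 2) for i in range(200)]
random.shuffle(data)
train, validation, test = data[:120], data[120:160], data[160:]

# Steps 3-4: the features are the two coordinates; the algorithm is k-NN.
def predict(train_set, x, k):
    nearest = sorted(
        train_set,
        key=lambda item: (item[0][0] - x[0]) ** 2 + (item[0][1] - x[1]) ** 2,
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train_set, eval_set, k):
    correct = sum(predict(train_set, x, k) == y for x, y in eval_set)
    return correct / len(eval_set)

# Step 5: use the validation set to choose the control parameter k.
best_k = max((1, 3, 5), key=lambda k: accuracy(train, validation, k))

# Step 6: evaluate once on the held-out test set.
test_accuracy = accuracy(train, test, best_k)
print(best_k, test_accuracy)
```

Keeping the test set out of the choice of k is what makes the final accuracy an honest estimate for unseen data.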
Supervised learning - Classification
Goal: Learning a function for a categorical output.
E.g.: Spam filtering. The output (“Spam?”) is binary.
      Gender. The output (Male/Female?) is binary.

      Sender in address book?   Header keyword   Word 1      Word 2      …   Spam?
x1    Yes                       Schedule         Hi          Professor   …   No
x2    Yes                       meeting          Joelle      I           …   No
x3    No                        urgent           Unsecured   Business    …   Yes
x4    No                        offer            Hello       I           …   Yes
x5    No                        cash             We’ll       Help        …   Yes
x6    No                        comp-551         Dear        Professor   …   No
…
Supervised Learning: Classification
• A way to identify a grouping for a given dataset
• Depending on the value of the target (output) attribute, each record in the dataset can be assigned to a class
• This technique helps in identifying data behavior patterns
[Figure: customers plotted by Total Items Purchased and Total Money Spent]
Determine good or bad customers?
All customers who spend more than 800 dollars in a single purchase are categorized as good customers.
Classification: Definition
• Given a collection of records (training set)
– Each record contains a set of attributes; one of the attributes is the class.
• Find a model for the class attribute as a function of the values of the other attributes.
• Goal: previously unseen records should be assigned a class as accurately as possible.
– A test set is used to determine the accuracy of the model.
– Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
Classification Example
Training Set
Tid   Refund   Marital Status   Taxable Income   Cheat
1     Yes      Single           125K             No
2     No       Married          100K             No
3     No       Single           70K              No
4     Yes      Married          120K             No
5     No       Divorced         95K              Yes
6     No       Married          60K              No
7     Yes      Divorced         220K             No
8     No       Single           85K              Yes
9     No       Married          75K              No
10    No       Single           90K              Yes

Test Set
Refund   Marital Status   Taxable Income   Cheat
No       Single           75K              ?
Yes      Married          50K              ?
No       Married          150K             ?
Yes      Divorced         90K              ?
No       Single           40K              ?
No       Married          80K              ?

Flow: Training Set → Learn Classifier → Model; the Model is then applied to the Test Set.
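One way to "learn a classifier" from the training set above is sketched below. The shape of the model is an assumption made for this sketch (records with a refund or a married status are predicted "No", the rest are split on income); only the income threshold is learned, by picking the candidate with the fewest training errors. The slides do not say which classifier was actually used.

```python
# (refund, marital_status, taxable_income_in_K, cheat), copied from the training set.
train = [
    ("Yes", "Single", 125, "No"), ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"), ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"), ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"), ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"), ("No", "Single", 90, "Yes"),
]
# (refund, marital_status, taxable_income_in_K), copied from the test set.
test = [
    ("No", "Single", 75), ("Yes", "Married", 50), ("No", "Married", 150),
    ("Yes", "Divorced", 90), ("No", "Single", 40), ("No", "Married", 80),
]

def predict(record, threshold):
    """Assumed tree shape: refund or married means "No", otherwise split on income."""
    refund, marital, income = record[:3]
    if refund == "Yes" or marital == "Married":
        return "No"
    return "Yes" if income > threshold else "No"

def training_errors(threshold):
    return sum(predict(r, threshold) != r[3] for r in train)

# Candidate thresholds: midpoints between consecutive distinct incomes.
incomes = sorted({r[2] for r in train})
candidates = [(a + b) / 2 for a, b in zip(incomes, incomes[1:])]
threshold = min(candidates, key=training_errors)  # first candidate with fewest errors

predictions = [predict(r, threshold) for r in test]
print(threshold, training_errors(threshold), predictions)
```

More than one threshold fits the training set without error (72.5K and 80K both do), and they disagree on the 75K test record, which is why accuracy has to be measured on data the model has not seen.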
Supervised learning - Regression
Goal: Learning a function for a continuous output.
E.g.: Self-driving car speed control. The output (“speed”) is continuous.
Supervised: Regression
• Predict a value of a given continuous valued variable based on the values of
other variables, assuming a linear or nonlinear model of dependency.
• Extensively studied in statistics and in the neural network field.
• Examples:
– Predicting sales amounts of new product based on advertising
expenditure.
– Predicting wind velocities as a function of temperature, humidity, air
pressure, etc.
– Time series prediction of stock market indices.
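The first example can be sketched with an ordinary least-squares line fit. The advertising and sales figures below are invented and deliberately noise-free so the result is easy to check by hand; a linear model of dependency is assumed.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: advertising expenditure and resulting sales, same units.
advertising = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 45.0, 65.0, 85.0, 105.0]   # generated by sales = 2 * advertising + 5

slope, intercept = fit_line(advertising, sales)
predicted_sales = slope * 60.0 + intercept   # continuous prediction for a new input
print(slope, intercept, predicted_sales)
```

Unlike classification, the output is a number on a continuous scale, so the model is judged by how far its predictions fall from the true values rather than by a count of correct labels.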
Unsupervised Learning
• Unsupervised learning is a type of machine learning in which models are trained on an unlabeled dataset and are allowed to act on that data without any supervision.
• The models themselves find hidden patterns and insights in the given data.
• Unsupervised learning cannot be directly applied to a regression or classification problem because, unlike supervised learning, we have the input data but no corresponding output data.
• The goal of unsupervised learning is to find the underlying structure of the dataset, group the data according to similarities, and represent the dataset in a compressed format.
Why use Unsupervised Learning?
• Unsupervised learning is helpful for finding useful insights from the data.
• Unsupervised learning is similar to how a human learns to think from their own experiences, which makes it closer to real AI.
• Unsupervised learning works on unlabeled and uncategorized data, which adds to its importance.
• In the real world, we do not always have input data with corresponding outputs; unsupervised learning is needed for such cases.
Unsupervised learning
Goal: Learning a function over the input alone.
E.g. Organizing data into classes. Inferring distances between data points.
Unsupervised: Clustering
• Given a set of data points, each having a set of attributes, and a similarity measure
among them, find clusters such that
– Data points in one cluster are more similar to one another.
– Data points in separate clusters are less similar to one another.
• Similarity Measures:
– Euclidean Distance if attributes are continuous.
– Other Problem-specific Measures.
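A minimal sketch of clustering with Euclidean distance is k-means, which this slide does not name but which appears earlier in the deck's list of clustering methods. The six points, k = 2, and the fixed number of iterations are assumptions chosen for illustration.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels are used anywhere: the grouping comes only from the distances between points, which is what makes this unsupervised.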
Illustrating Clustering
Euclidean distance-based clustering in 3-D space: intracluster distances are minimized and intercluster distances are maximized.
Clustering: Application
• Document Clustering:
– Goal: To find groups of documents that are similar to each other based on the important terms
appearing in them.
– Approach: To identify frequently occurring terms in each document. Form a similarity measure
based on the frequencies of different terms. Use it to cluster.
– Gain: Information Retrieval can utilize the clusters to relate a new document or search term to
clustered documents.
Example: Google Scholar
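The similarity measure in the approach above can be sketched with raw term frequencies and cosine similarity. The three documents and the query are invented, and cosine similarity is one assumed choice of measure; the sketch stops at relating a new document to its most similar existing one and does not run a full clustering.

```python
import math
from collections import Counter

def term_frequencies(text):
    """Count how often each term occurs in a document."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors (0 = no shared terms)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical documents, invented for illustration.
docs = {
    "d1": "neural network training with gradient descent",
    "d2": "gradient descent for deep neural network models",
    "d3": "supermarket sales of milk and bread",
}
vectors = {name: term_frequencies(text) for name, text in docs.items()}

new_doc = term_frequencies("neural network optimization")
closest = max(vectors, key=lambda name: cosine_similarity(vectors[name], new_doc))
print(closest, round(cosine_similarity(vectors["d1"], vectors["d2"]), 3))
```

A clustering algorithm would use the same pairwise similarities to group d1 with d2 and keep d3 apart.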
Unsupervised Learning: Association Rule Discovery: Application
• Supermarket shelf management.
– Goal: To identify items that are bought together by sufficiently many customers.
– Approach: Process the point-of-sale data collected with barcode scanners to find dependencies
among items.
– A classic rule --
• If a customer buys diapers and milk, then he is very likely to buy beer.
• So, don’t be surprised if you find six-packs stacked next to diapers!
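How strongly the data backs such a rule is usually summarized by its support and confidence, two standard measures that the slide does not spell out. The five transactions below are invented point-of-sale baskets used only to show the arithmetic.

```python
# Hypothetical point-of-sale baskets, invented for illustration.
transactions = [
    {"diapers", "milk", "beer"},
    {"diapers", "milk", "beer", "bread"},
    {"diapers", "milk"},
    {"milk", "bread"},
    {"beer", "bread"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Among baskets containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)

rule_support = support({"diapers", "milk", "beer"})
rule_confidence = confidence({"diapers", "milk"}, {"beer"})
print(round(rule_support, 3), round(rule_confidence, 3))   # 0.4 0.667
```

Here 2 of 5 baskets contain all three items, and 2 of the 3 baskets with diapers and milk also contain beer. Algorithms such as Apriori search for all rules whose support and confidence exceed chosen minimums.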
Reinforcement learning
Goal: Learning a sequence of actions that optimizes costs/rewards.
E.g.: Balancing an inverted pendulum.
Reinforcement Learning
• Topics:
– Policies: what actions should an agent take in a particular situation
– Utility estimation: how good is a state (used by policy)
• No supervised output but delayed reward
• Credit assignment problem (what was responsible for the
outcome)
• Applications:
– Game playing
– Robot in a maze
– Multiple agents, partial observability, ...
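Delayed reward and utility estimation can be sketched with tabular Q-learning, a technique the slides do not name and which is used here only as an illustration. The environment is a hypothetical five-cell corridor standing in for the robot-in-a-maze application; the reward arrives only at the last cell, and the hyperparameters are assumed values.

```python
import random

N_STATES = 5            # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)      # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # assumed learning rate, discount, exploration

def step(state, action):
    """Move within the corridor; the only reward is given on reaching the last cell."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]: utility estimates

def choose_action(state):
    """Explore with probability EPSILON (or on ties); otherwise exploit the best estimate."""
    if rng.random() < EPSILON or q[state][0] == q[state][1]:
        return rng.randrange(2)
    return 0 if q[state][0] > q[state][1] else 1

for _ in range(200):                         # episodes
    state = 0
    while state != N_STATES - 1:
        a = choose_action(state)
        next_state, reward = step(state, ACTIONS[a])
        # Credit assignment: pass the discounted future value back to this state-action pair.
        q[state][a] += ALPHA * (reward + GAMMA * max(q[next_state]) - q[state][a])
        state = next_state

policy = [+1 if q[s][1] > q[s][0] else -1 for s in range(N_STATES - 1)]
print(policy)
```

No step is ever labeled as correct; the agent only sees a reward at the end, and the update rule spreads credit for it back along the sequence of actions.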
Most of the knowledge in the world in the future is going to be
extracted by machines and will reside in machines.
Yann LeCun, Director of AI Research, Facebook
So How Do Computers Discover New Knowledge?
1. Fill in gaps in existing knowledge
2. Emulate the brain
3. Simulate evolution
4. Systematically reduce uncertainty
5. Notice similarities between old and new
The Five Tribes of Machine Learning
Pedro Domingos, University of Washington
https://learning.acm.org/webinar_pdfs/PedroDomingos_FTFML_WebinarSlides.pdf