Selected Topics in CS-CHapter-twooo.pptx

04/05/2025 1
Machine learning & Data Mining Applications
Machine Learning
• Basics of ML(Machine Learning)
• Types of Machine Learning
• Machine Learning Algorithms
Data Mining Applications
• Introduction
• Data Mining Algorithms
•

04/05/2025 2
Basics of ML(Machine Learning)
Defn: Machine Learning(ML)
 is a system of computer algorithms that can learn from
example through self-improvement without being
explicitly coded by a programmer.
 optimizes performance criterion using example data or
past experience.
 is a part of artificial Intelligence which combines data
with statistical tools to predict an output which can be
used to make actionable insights.
 Machine learning is closely related to data mining and
Bayesian predictive modeling.

04/05/2025 3
Basics of ML(Machine Learning)…
Machine Learning(ML)VSTraditional Programming
 In traditional programming,
◦ a programmer code all the rules in consultation with an expert
in the industry for which software is being developed.
◦ the machine will execute an output following the logical
statement.
◦ When the system grows complex, more rules need to be
written hence ti become unsustainable to maintain.

04/05/2025 4
Machine Learning(ML)VSTraditional Programming
 In Machine Learning,
◦ The machine learns how the input and output data are
correlated and it writes a rule.
◦ The programmers do not need to write new rules each time
there is new data.
◦ The algorithms adapt in response to new data and experiences
to improve efficacy over time.

 How well does a learned model generalize from the
data it was trained on to a new test set?
Training set (labels known) Test set (labels
unknown)
Slide credit: L. Lazebnik
 Example 1.
 Consider the following images

04/05/2025 6
 Example 2.
 In the following labeled data, the examples are used as data
set to train the machine, and then to predict type of a new
fruit based on given characterstics.

 Example 3.
 Here the data set can be used to train a model that would predict
whether an application for credit should be approved, based on
age, job status, house, and credit rating.

 From the above example 3, a model will be built
and then it will be used for prediction of new un-
seen future data?
◦ Learn a classification model from the data
◦ Use the model to classify future loan applications into
 Yes (approved) and
 No (not approved)
 What is the class for following case/instance?

04/05/2025 9
 In the above Example 3, in predicting the new future’s
data classification, it could be rated to majority class
which gives accuracy of 9/15=60% this is without
training or machine learning.
 But the machine will attempt to do better than 60%

 Two Steps:
◦ Learning (training): Learn a model using the training
data
◦ Testing: Test the model using unseen test data to
assess the model accuracy
,
cases
test
of
number
Total
tions
classifica
correct
of
Number

Accuracy

04/05/2025 11
 Consider the following is sample data for test from the
previous dataset example
 If the model predicts the classes in the following
manner, then what is the Accuracy of the model?
Age Has_Job Own_House Credit_Rating
Actual
Class
Old TRUE FALSE Good Yes
Old TRUE FALSE excellent Yes
Old FALSE FALSE Fair No
Age Has_Job Own_House Credit_Rating
Predicte
d Class
Old TRUE FALSE Good Yes
Old TRUE FALSE excellent No
Old FALSE FALSE Fair Yes

04/05/2025 12
 The accuracy =1/3(True classications/TotalTest Data)
 The confusion matrix is used to visualizes the accuracy of
a classifier by comparing the actual and predicted classes.
 The binary confusion matrix is composed of squares:
 It evaluates the performance of the classification models,
when they make predictions on test data, and tells how
good our classification model is.

04/05/2025 13
 The confusion matrix: is a performance measurement technique for
Machine learning classification.
 It is a kind of table which helps to describe the performance of the
classification model on a set of test data .
 True Negative: Model has given prediction No, and the real or actual value
was also No.
 True Positive: The model has predicted yes, and the actual value was also true.
 False Negative: The model has predicted no, but the actual value wasYes, it is
also called as Type-II error.
 False Positive: The model has predictedYes, but the actual value was No. It is
also called a Type-I error.
n=Total Preductions
Actual
Yes No Total
Predicted
Yes TP FP P'
No FN TN N'
P N P+N

04/05/2025 14
 Hence for the above test data the confusion matrix
looks like:
=(1+0)/(1+0+1+1)===1/3
◦ =1/(1+1)=1/2
n=3
Actual
Yes No Total
Predicted
Yes 1 1 P‘=2
No 1 0 N‘=1
P=2 N=1 P+N=3

Testing
Training
Generalization Error
Number of Training Examples
Error
Fixed prediction model
Slide credit: D. Hoiem
 As the data set(training & test data) increases the
model’s error would decrease and vise versa.

04/05/2025 16
Types of Machine Learning
 Machine learning can be grouped into two
broad learning tasks: Supervised and Unsupervised.
◦1/ Supervised Learning
 An algorithm uses training data and feedback from humans to
learn the relationship of given inputs to a given output.
 It uses labelled dataset to class catageries.
◦ There are two categories of supervised
learning:
 Classification task
 Regression task

04/05/2025 17
Types of Machine Learning…
 Classification
◦ Imagine you want to predict the gender of a customer for a commercial.
◦ You will start gathering data on the height, weight, job, salary, purchasing
basket, etc. from your customer database.
◦ You know the gender of each of your customer, it can only be male or
female.
◦ The objective of the classifier will be to assign a probability of being a
male or a female (i.e., the label) based on the information (i.e., features
you have collected).
◦ When the model learned how to recognize male or female, you can use
new data to make a prediction.
◦ For instance, you just got new information from an unknown customer,
and you want to know if it is a male or female. If the classifier predicts
male = 70%, it means the algorithm is sure at 70% that this customer is a
male, and 30% it is a female.

04/05/2025 18
 Regression
◦ When the output is a continuous value, the task is a
regression.
◦ For instance, a financial analyst may need to forecast
the value of a stock based on a range of feature like
equity, previous stock performances, macroeconomics
index.
◦ The system will be trained to estimate the price of
the stocks with the lowest possible error.

04/05/2025 19
 Regression Example
 Consider the values of two variables for eg. Income (y) and
investment(x) is which is obtained from campany dataset is
mapped on a graph as shown below.
 The machine can be trained to develop a regular pattern or
formula that will represent the income in terms of investment:
y=a + bx [Where y=income, x=investment,a=constant, b=coeffecient]

04/05/2025 20
 Unsupervised learning
◦ Learning a model from unlabeled data.(e.g., explores
customer demographic data to identify patterns)
◦ You can use it when you do not know how to
classify the data, and you want the algorithm
to find patterns and classify the data for you
◦ Eg. Clustering: segmenting data set according
to their features.

04/05/2025 21
Machine Learning Algorithms
 Following are examples of Algorithms that support
supervised Machine Learning

04/05/2025 22
Machine Learning Algorithms…
 Following are examples of Algorithms that support
Unsupervised Machine Learning

04/05/2025 23
 Linear Regression
◦ is type of supervised machine learning that uses predictor
variable and dependant variable.
◦ Applies to variables of continousValues.
 Logistic Regression
◦ Logistic regression is one of the most popular machine learning
algorithms for binary classification.
◦ In linear regression theY variable is always a continuous variable.
If suppose, the Y variable was categorical, you cannot use linear
regression model it.
◦ Logistic regression is a classic predictive modelling technique
and still remains a popular choice for modelling binary
categorical variables.

04/05/2025 24
 Logistic Regression..
◦ Unlike linear regression, the output is transformed into a
probability using the logistic function:
◦ Logistic regression achieves this by taking the log odds of
the event log(P/1-P),
 where, P is the probability of event. So P always lies between 0
and 1.
◦ Logistic regressions work with odds which are simply the
ratio of the proportions for the two possible outcomes.
◦ The log odds are modeled as a linear function of the
explanatory variable:

04/05/2025 25
 Decision Tree
◦ IsType of Supervised Machine learning Algorithm
◦ Creates a model that predicts the value of a target variable, for
which the decision tree uses the tree representation to solve
the problem.
◦ In decision tree the leaf node corresponds to a class label and
attributes are represented on the internal node of the tree.

04/05/2025 26
 DecisionTree Example
 Considering the data set below with target
class play, then it will generate a decision tree.

04/05/2025 27
 Decision Tree Example..

04/05/2025 28
 Popular DecisionTree algorithms include:
 ID3 (extension of D3)
→
C4.5 (successor of ID3)
→
CART (Classification And Regression Tree)
→
CHAID (Chi-square automatic interaction detection Performs
→
multi-level splits when computing classification trees)
MARS (multivariate adaptive regression splines)
→

04/05/2025 29
 K-Means Clustering
◦ IsType of UnSupervised Machine learning Algorithm
◦ It make inferences from datasets using only input without
referring to known, or labelled, outcomes.
◦ In K means clustering you’ll define a target number k, which
refers to the number of centroids you need in the dataset.
◦ A centroid is the imaginary or real location representing the
center of the cluster.
Every data point is allocated to each of the clusters through
reducing the in-cluster sum of squares.
◦ The ‘means’ in the K-means refers to averaging of the data; that
is, finding the centroid.

04/05/2025 30
 Aritificial Neural Network
◦ A Neural Network is asystem composed of many simple
processing elements operating in parallel which can acquire, store,
and utilize experiential knowledge.
◦ A nerve cell neuron (in biological neuron) is a special biological
cell that processes information(having similarities with the ANN:
for example dendrites—same to input neuron ,Axions—same to
output neurons in ANN).
◦ The learning algorithm of a neural network can either be
supervised or unsupervised.
◦ A neural net is said to learn supervised, if the desired output is
already known otherwise it is Unspervised.
◦ Unsupervised Learnig algorithms of ANN are more applicable in
vast areas.

04/05/2025 31
 Aritificial Neural Network
◦ Computational models inspired by the human brain:
 Algorithms that try to mimic the brain.
 Massively parallel, distributed system, made up of simple
processing units (neurons)
 Synaptic connection strengths among neurons are used to
store the acquired knowledge.
 Knowledge is acquired by the network from its
environment through a learning process

04/05/2025 32
 An artificial neural network
◦ has dozens of artificial neurons—called units—
arranged in a series of layers.
◦ The input layer receives various forms of
information from the outside world.
◦ This is the data that the network aims to process
or learn about.
◦ From the input unit, the data goes through one or
more hidden units.
◦ The hidden unit’s job is to transform the input
into something the output unit can use.

04/05/2025 33
 Artificial Neural Network
◦ Training Neural Networks is a NON
CONVEX OPTIMIZATION PROBLEM.
◦ This means we can run into many local
optima during training.
◦ We need to first perform a forward pass
◦ Then, we update weights with a backward
pass(Back ward Prop)

04/05/2025 34
 ANNs have been widely used in various domains for:
 Pattern recognition
 Function approximation
 Associative memory

04/05/2025 35
 Properties of ANN
◦ Inputs are flexible
 any real values
 Highly correlated or independent
◦ Target function may be discrete-valued, real-valued, or
vectors of discrete or real values
 Outputs are real numbers between 0 and 1
◦ Long training time & Fast evaluation

Different Network Topologies
 Single layer feed-forward networks
◦ Input layer projecting into the output layer
Input Output
layer layer
Single layer
network

 Multi-layer feed-forward networks
◦ One or more hidden layers. Input projects
only from previous layers onto a layer.
Input Hidden Output
layer layer layer
2-layer or
1-hidden layer
fully connected
network

 Recurrent networks
◦ A network with feedback, where some
of its inputs are connected to some of its
outputs (discrete time).
Input Output
layer layer
Recurrent
network

04/05/2025 39
 Deep Learning ANN
◦ It is type of ANN with many hidden layers.
◦ Deep Learning is nothing but stacking multiple
such hidden layers between the input and the
output layer, hence the name Deep learning

• A machine learning subfield of learning representations of data.
• Deep learning algorithms attempt to learn (multiple levels of) representation
by using a hierarchy of multiple layers
• If you provide the system tons of information, it begins to understand it
and respond in useful ways.

 Deep Learning ANN
◦ It is type of ANN with many hidden
layers.

04/05/2025 42
 Deep Learning Methods
◦ There are various methods designed to apply deep learning.
◦ Each proposed method has a specific use case like
 the kind of data you have, whether it is supervised or
unsupervised learning
 type of task you would want to solve with the data.
◦ So depending on these factors, you choose one of the methods
that can best solve your problem.
◦ Some of the deep learning methods are:
 Convolutional Neural Network,
 Recurrent Neural Network,
 Long short-term memory,

04/05/2025 43
Data Mining--Introducation
 Data mining (knowledge discovery from data)
◦ Extraction of interesting (non-trivial, implicit, previously
unknown and potentially useful) patterns or knowledge from
huge amount of data
◦ Data mining: a misnomer?
 Alternative names
◦ Knowledge discovery (mining) in databases (KDD),
◦ knowledge extraction,
◦ data/pattern analysis,
◦ data archeology, data dredging, information harvesting, business
intelligence, etc.

CS590D 44
Data Mining—Introducation…
Data Mining
Knowledge Mining
Knowledge Discovery
in Databases
Data Dredging
Database Mining
Knowledge Extraction
Data Pattern Processing
Information Harvesting
Siftware
The process of discovering meaningful new correlations, patterns,
and trends by sifting through large amounts of stored data, using
pattern recognition technologies and statistical and mathematical
techniques
Data Archaeology
Alternate Names for Data mining:

46
Data Mining—Introducation…
 Data mining is the analysis of (often large) observational
data sets to find unsuspected relationships and to
summarize the data in novel ways that are both
understandable and
useful to the data owner
 The relationships and summaries derived through a data
mining exercise are often referred to as models or patterns.
Examples include
◦ linear equations,
◦ rules, clusters,
◦ graphs,
◦ tree structures, and
◦ recurrent patterns in time series.

47
Data Mining—Introducation..
 Data mining typically deals with data that have already been
collected for some purpose other than the data mining
analysis (for example, they may have been collected in order
to maintain an up-to-date record of all the transactions in a
bank).
 This means that the objectives of the data mining exercise do
not focus on the data collection strategy.
 This is one way in which data mining differs from much of
statistics, in which data are often collected by using efficient
strategies to answer specific questions.
 For this reason, data mining is often referred to as
"secondary" data analysis.

48
 The Cross-Industry Standard Process for Data Mining
(CRISP–DM) [16] was developed in 1996 by analysts
provides a nonproprietary and freely available standard
process for fitting data mining into the general problem-
solving strategy of a business or research unit.
 According to CRISP–DM, a given data mining project
has a life cycle consisting of six phases, as illustrated in
Figure below.

49
 CRISP-DM

50
 Classification: predicting an item class
 Clustering: finding clusters in data
 Associations: e.g.A & B & C occur frequently
 Outliner Detection: finding changes
 Regression Analysis: predicting a
continuous value
 Trend Analysis: finding relationships
 …

51
Data mining—Potential Applications
 Data analysis and decision support
◦ Market analysis and management
 Target marketing, customer relationship management (CRM), market
basket analysis, cross selling, market segmentation
◦ Risk analysis and management
 Forecasting, customer retention, improved underwriting, quality
control, competitive analysis
◦ Fraud detection and detection of unusual patterns (outliers)
 Other Applications
◦ Text mining (news group, email, documents) andWeb mining
◦ Stream data mining
◦ DNA and bio-data analysis

52
Data mining—Potential Applications..
 Market Analysis and Management
◦ Target marketing
 Find clusters of “model” customers who share the same
characteristics: interest, income level, spending habits, etc.
 Determine customer purchasing patterns over time
◦ Cross-market analysis
 Associations/co-relations between product sales, & prediction based
on such association
◦ Customer profiling
 What types of customers buy what products (clustering or
classification)
◦ Customer requirement analysis
 identifying the best products for different customers
 predict what factors will attract new customers
◦ Provision of summary information
 multidimensional summary reports
 statistical summary information (data central tendency and variation)

53
 Corporate Analysis & Risk Management
◦ Finance planning and asset evaluation
 cash flow analysis and prediction
 contingent claim analysis to evaluate assets
 cross-sectional and time series analysis (financial-ratio, trend
analysis, etc.)
◦ Resource planning
 summarize and compare the resources and spending
◦ Competition
 monitor competitors and market directions
 group customers into classes and a class-based pricing
procedure
 set pricing strategy in a highly competitive market

54
 Fraud Detection & Mining Unusual Patterns
◦ Approaches: Clustering & model construction for frauds, outlier
analysis
◦ Applications: Health care, retail, credit card service, telecomm.
 Auto insurance: ring of collisions
 Money laundering: suspicious monetary transactions
 Medical insurance
 Professional patients, ring of doctors, and ring of references
 Unnecessary or correlated screening tests
 Telecommunications: phone-call fraud
 Phone call model: destination of the call, duration, time of day or week.
Analyze patterns that deviate from an expected norm
 Retail industry
 Analysts estimate that 38% of retail shrink is due to dishonest employees
 Anti-terrorism

55
Data Mining Algorithms
 As Data mining is considered to be the
process of extracting useful information from
a vast amount of data, it employs data
preparation, training, testing and
interperation and deployment steps.
 It mainly involves machine learning
techniques.
 So while data mining needs machine
learning, machine learning doesn’t necessarily
need data mining.

56
Data Mining Algorithms
 Hence the machine learning algorithms
are used also in data mining applications.
 These includes:
◦ Decision tree
◦ k-Means
◦ linear regression
◦ ANN(Artificial Neural Networks)
◦ etc.

57
Data Mining& Machine Learning Research
Areas
 Following are some thematic areas for
data mining researches:
◦ Fraud Detection(for banks, telecom, etc)
◦ Market Analysis/Stock Market Analysis
◦ Customer trend analysis(CRM)
◦ Financial Analysis(Profit analysis)
◦ Website Evaluation(Web mining)
◦ Weather Forecasting using Data Mining

58
Data Mining& Machine Learning Research Areas…
 Fraud Detection
◦ The number of frauds in daily life is increasing in sectors like
banking, finance, and government.
◦ Accurate detection of fraud is a challenge.
◦ Data mining techniques help in anticipation and detection of
fraud.
◦ Data mining tools can be used to spot patterns and detect fraud
transactions.
◦ Through data mining, factors leading to fraud can be determined.

59
 Web Mining
◦ Web Mining is an application of Data Mining and an important
research area.
◦ It is a technique to discover patterns from WWW i.e World
WideWeb.
◦ The information for web mining is collected through browser
activities, page content and server logins.
◦ There are three types ofWeb Mining:
 Web Usage Mining
 Web Content Mining
 Web Structure Mining

60
 Opinion Mining
◦ Opinion mining, also known as sentiment mining, is a natural language
processing method to analyze the sentiments of customers about a
particular product.
◦ It is widely used in areas like surveys, public reviews, social media,
healthcare systems, marketing etc.
◦ Automated opinion mining employs machine learning algorithms to
analyze the sentiments.

Selected Topics in CS-CHapter-twooo.pptx

More Related Content

Similar to Selected Topics in CS-CHapter-twooo.pptx

Recently uploaded

Selected Topics in CS-CHapter-twooo.pptx