Introduction to data visualization tools like Tableau and Power BI and Excel

An Introduction to Popular Tools,
Machine Learning, and Visualization

Replay
Introduction to Data Science Tools (R, SQL)
• SQL commands and a basic hands-on session with the command-line interface
• R installation and hands-on session

Session 6: Machine Learning
Agenda:
What is Machine Learning
Categories of Machine Learning
Common Algorithms
• Linear Regression
• Naïve Bayes
• SVM
• Decision Tree
• KNN
• Random Forest
• K-Means Clustering
Machine Learning Process
Real-world use case

What is Machine Learning
Machine learning is a sub-field of Artificial Intelligence in which computers provide
predictions based on patterns learned directly from
data without being explicitly programmed to do so.

Categories of Machine Learning

What is an Algorithm?
Algorithms in machine learning are mathematical procedures and techniques that allow computers to learn from data, identify patterns, make predictions, or perform tasks without explicit programming.

Common Algorithms – Linear Regression
Linear regression models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name. The fitted model describes how the value of the dependent variable changes as the value of the independent variable changes.
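
As a rough illustration of the idea above, the sketch below fits a straight line to one independent variable with ordinary least squares and uses it to predict a new value. The function name `fit_line` and the study-hours data are made up for this example and are not part of the original slides.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single x variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)

# Prediction for an unseen x: plug it into the fitted line.
predicted_score = slope * 6 + intercept
print(slope, intercept, predicted_score)
```

The slope is the amount by which y changes for each one-unit change in x, which is the "how the dependent variable changes" part of the description.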

Common Algorithms – Naïve Bayes
Naïve Bayes algorithms calculate the probability that an event will occur based on the occurrence of a related event. They apply Bayes' theorem and "naively" assume that the features are independent of each other.
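
The calculation can be sketched as follows, assuming a two-class problem (an event is either true or not). The function name `naive_bayes_posterior` and all the probabilities are invented for illustration; a real classifier would estimate them from training data.

```python
def naive_bayes_posterior(prior, likelihoods_if_true, likelihoods_if_false):
    """Probability that the event is true given several observed features.

    The "naive" assumption is that features are independent, so their
    likelihoods are simply multiplied together.
    """
    score_true = prior
    for p in likelihoods_if_true:
        score_true *= p
    score_false = 1 - prior
    for p in likelihoods_if_false:
        score_false *= p
    # Bayes' theorem: normalise by the total probability of the evidence.
    return score_true / (score_true + score_false)


# Made-up numbers: 20% of mail is spam; the word "offer" appears in 60% of
# spam and 5% of non-spam; the word "free" in 50% of spam and 10% of non-spam.
p_spam_one_word = naive_bayes_posterior(0.2, [0.6], [0.05])
p_spam_two_words = naive_bayes_posterior(0.2, [0.6, 0.5], [0.05, 0.1])
print(p_spam_one_word, p_spam_two_words)
```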

Common Algorithms – SVM
• The goal of the SVM algorithm is to find the best line or decision boundary that separates n-dimensional space into classes, so that a new data point can easily be placed in the correct category. This best decision boundary is called a hyperplane.
• SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, hence the name Support Vector Machine.
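
The sketch below does not train an SVM. It only illustrates, under stated assumptions, how a hyperplane classifies points and which training points sit closest to it. The weights `w`, the offset `b`, and the six training points are hand-picked for this example.

```python
import math

# Hypothetical hyperplane w . x + b = 0 in two dimensions (chosen by hand).
w = (1, 1)
b = -5


def score(point):
    """Signed value of w . x + b; its sign tells which side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, point)) + b


def classify(point):
    return 1 if score(point) >= 0 else -1


# Made-up training points with labels +1 / -1.
training = [
    ((4, 4), 1), ((3, 3), 1), ((5, 3), 1),
    ((1, 1), -1), ((2, 2), -1), ((1, 3), -1),
]

# The points nearest the boundary play the role of support vectors here.
smallest = min(abs(score(p)) for p, _ in training)
support_vectors = [p for p, _ in training if abs(score(p)) == smallest]

# Geometric distance from those points to the hyperplane (the margin).
margin = smallest / math.hypot(*w)
print(support_vectors, margin)
```

A real SVM implementation searches for the `w` and `b` that make this margin as large as possible.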

Common Algorithms – Decision Tree
A decision tree is a tree-structured classifier in which internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents an outcome.
• The decisions, or tests, are performed on the basis of the features of the given dataset.
• It is a graphical representation of all the possible solutions to a problem or decision under given conditions.
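
To make the node/branch/leaf structure concrete, here is a minimal sketch of a hand-written tree and a function that walks it. The features ("outlook", "humidity", "windy") and outcomes are hypothetical; a real algorithm would learn the tree from data rather than have it typed in.

```python
# Internal nodes are dicts naming a feature; branches map feature values to
# the next node; leaves are plain strings holding the outcome.
tree = {
    "feature": "outlook",
    "branches": {
        "sunny": {
            "feature": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "feature": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
    },
}


def predict(node, sample):
    """Follow the branch matching the sample's feature value until a leaf."""
    while isinstance(node, dict):
        node = node["branches"][sample[node["feature"]]]
    return node


print(predict(tree, {"outlook": "sunny", "humidity": "high"}))
print(predict(tree, {"outlook": "overcast"}))
```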

Common Algorithms – KNN
• The K-NN algorithm assumes that similar cases lie close together: it compares a new case with the available cases and puts it into the category it is most similar to.
• K-NN stores all the available data and classifies a new data point based on similarity, so when new data appears it can easily be assigned to a well-suited category.
• It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead it stores the dataset and defers the work until the time of classification.
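
A minimal sketch of the idea, assuming Euclidean distance as the similarity measure and a simple majority vote among the k nearest neighbours. The function name `knn_predict` and the data points are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    # "Training" is just storage: all the work happens at prediction time.
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up labelled points forming two groups.
training = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B"),
]

print(knn_predict(training, (2, 2)))
print(knn_predict(training, (6, 5)))
```

Note that there is no separate training step, which is what "lazy learner" refers to.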

Common Algorithms – Random Forest
• Random Forest is a classifier that trains a number of decision trees on various subsets of the given dataset and combines their predictions (a majority vote for classification, an average for regression) to improve predictive accuracy.
• It is based on the concept of ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and to improve the performance of the model.
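
The ensemble idea can be sketched in a much simplified form: each "tree" below is only a one-level decision stump on a single feature, trained on a random bootstrap sample, and the forest takes a majority vote. The names `train_stump`, `train_forest`, and `forest_predict`, the labels "A"/"B", and the data are all assumptions made for this example. Real random forests use full decision trees and also sample the features.

```python
import random
from collections import Counter


def train_stump(samples):
    """Pick the threshold and label order that misclassify the fewest samples."""
    best = None
    for threshold, _ in samples:
        for low, high in (("A", "B"), ("B", "A")):
            errors = sum(
                1 for x, label in samples
                if (low if x <= threshold else high) != label
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, low, high)
    _, threshold, low, high = best
    return lambda x: low if x <= threshold else high


def train_forest(samples, n_trees=15, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [
        train_stump([rng.choice(samples) for _ in samples])
        for _ in range(n_trees)
    ]


def forest_predict(forest, x):
    """Majority vote across all stumps."""
    return Counter(tree(x) for tree in forest).most_common(1)[0][0]


# Made-up one-dimensional data: small values are "A", large values are "B".
data = [(1, "A"), (2, "A"), (3, "A"), (4, "A"),
        (7, "B"), (8, "B"), (9, "B"), (10, "B")]

forest = train_forest(data)
print(forest_predict(forest, 2), forest_predict(forest, 9))
```

Individual stumps may be wrong when their bootstrap sample is unrepresentative. Voting across many of them is what makes the combined prediction more stable.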

Common Algorithms – K-Means Clustering
• K-Means Clustering is an Unsupervised Learning algorithm that groups an unlabeled dataset into different clusters. Here K is the predefined number of clusters to be created: if K=2 there will be two clusters, if K=3 there will be three clusters, and so on.
• It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of the algorithm is to minimize the sum of distances between the data points and the centroids of their corresponding clusters.
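
A minimal sketch of the assign-then-update loop behind K-Means, assuming Euclidean distance and a naive initialisation that takes the first K points as starting centroids (real implementations usually initialise more carefully). The function name `k_means` and the points are made up for this example.

```python
import math


def k_means(points, k, iterations=10):
    """Alternate between assigning points to centroids and moving centroids."""
    centroids = list(points[:k])  # assumption: first k points as initial centroids
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Made-up unlabeled points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]

centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```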

Machine Learning Process
• Step 1: Collect and prepare the data
• Step 2: Train the model
• Step 3: Validate the model
• Step 4: Interpret the results
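
The four steps above can be sketched end to end on a toy problem. This example assumes a simple least-squares line as the model, a fixed hold-out split (every fourth row is kept for validation), and mean absolute error as the validation metric. The data and the helper `fit_line` are invented for illustration.

```python
def fit_line(rows):
    """Least-squares fit of y = slope * x + intercept over (x, y) rows."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Step 1: collect and prepare the data (made-up values), then split it.
rows = [(1, 2.1), (2, 3.9), (3, 6.1), (4, 7.9),
        (5, 10.1), (6, 11.9), (7, 14.1), (8, 15.9)]
validation = rows[::4]                              # held-out rows
training = [r for r in rows if r not in validation]

# Step 2: train the model on the training rows only.
slope, intercept = fit_line(training)

# Step 3: validate on rows the model has not seen.
errors = [abs((slope * x + intercept) - y) for x, y in validation]
mae = sum(errors) / len(errors)

# Step 4: interpret the results.
print(f"y changes by about {slope:.2f} per unit of x; validation MAE = {mae:.3f}")
```

Keeping the validation rows out of training is what makes the error in step 3 a fair estimate of how the model behaves on new data.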

Real-world use cases by category

Q: What are the main differences between supervised and unsupervised learning?

Q: How does a linear regression model make predictions?

Q: In what scenarios would you prefer using a Decision Tree over a Random Forest?


More from Lipika Sharma
