SCATTER PLOT DIAGRAM
Karthik M
Scatter plot diagram
• A scatter plot is a mathematical diagram using Cartesian
coordinates to display values for typically two variables for a set of
data.
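As a rough illustration of the definition above, the sketch below places paired values on a character grid, one axis per variable. The function name `text_scatter`, the grid size, and the sample data are all assumptions made for this example; a plotting library would normally do this job.

```python
def text_scatter(points, width=20, height=8):
    """Render (x, y) pairs as a character grid; returns a list of row strings."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid division by zero when all values on an axis are equal.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = round((x - x_min) / x_span * (width - 1))
        # Row 0 is the top of the plot, so larger y values get smaller row indices.
        row = (height - 1) - round((y - y_min) / y_span * (height - 1))
        grid[row][col] = "*"
    return ["".join(r) for r in grid]


# Hypothetical paired data: (hours studied, test score).
pairs = [(1, 52), (2, 55), (3, 61), (4, 64), (5, 70), (6, 74)]
rows = text_scatter(pairs)
print("\n".join(rows))
```

Each `*` is one observation: its column comes from the first variable and its row from the second.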
When to use a scatter diagram
• When you have paired numerical data
• To determine whether two variables are related
• To identify potential root causes of problems (whether cause and
effect are related)
• In regression model building, to decide which variables to
consider for regression
Positive correlation
Correlation describes the type of relationship between two data sets.
Positive correlation is a relationship between two variables in which
both variables move in the same direction: as one increases, so does the other.
Negative correlation
Negative correlation is a relationship
between two variables in which one
variable increases as the other decreases,
and vice versa.
Examples:
1. As a car gets older, it is worth less
2. The heavier a truck (weight in T), the lower its mileage (kmpl)
[Figure: value of car (₹) plotted against age of car (years)]
No correlation
There is no relationship or connection
between the two variables.
The correlation coefficient is close to 0.
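The three cases above can be told apart numerically. The sketch below assumes the Pearson correlation coefficient as the measure of correlation (the slides do not name one); the helper `pearson_r` and all data values are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data, invented for illustration only.
age = [1, 2, 3, 4, 5, 6]                    # age of car (years)
distance = [12, 25, 33, 48, 55, 70]         # rises with age
car_value = [9.0, 7.8, 6.9, 5.5, 4.8, 3.9]  # falls with age
unrelated = [8, 6, 9, 9, 6, 8]              # no linear link to age

r_pos = pearson_r(age, distance)
r_neg = pearson_r(age, car_value)
r_none = pearson_r(age, unrelated)
print(f"positive: {r_pos:.2f}, negative: {r_neg:.2f}, none: {r_none:.2f}")
```

A value near +1 indicates positive correlation, near -1 negative correlation, and near 0 no linear relationship. Note that this coefficient only measures straight-line association, so a curved pattern can still give a value close to 0.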
Why use a scatter diagram
• Plotting a scatter diagram is the first step in looking for a relationship
between two variables
• Assessing the strength of the relationship (strong, neutral, weak)
• Exploring trends and making predictions
• Finding outliers
Trend
• The line of best fit is the line that
comes closest to all the points
on a scatter plot.
• Lines of best fit can be used to
make predictions
• Example: how much money can we
expect to collect if 60 visitors
attend?
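One way to make "closest to all the points" concrete is ordinary least squares, which is assumed here; the slides do not say which fitting method they have in mind. The function `fit_line` and the visitor and money figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: number of visitors and money collected.
visitors = [10, 20, 30, 40, 50]
money = [52, 98, 151, 203, 246]

slope, intercept = fit_line(visitors, money)
predicted = slope * 60 + intercept  # prediction for 60 visitors
print(f"money = {slope:.2f} * visitors + {intercept:.2f}")
print(f"predicted for 60 visitors: {predicted:.1f}")
```

In this made-up data set, 60 visitors lies outside the observed range of 10 to 50, so the prediction is an extrapolation and should be treated with more caution than a prediction inside the range.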
Finding outliers
Outliers are observations that lie far
enough from the mean to be noticeably
different: they sit away from the main
cluster of points.
Application:
• They might be errors in data entry
• They may indicate that something is wrong
with our model
• They can help to detect fraud
Outliers that are confirmed errors can be discarded without harm to the study
With outlier: average = 2.8
Without outlier: average = 2.1
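The effect of a single outlier on the average can be sketched as follows. The values are made up, chosen only so that the two averages match the 2.8 and 2.1 quoted above; the rule of flagging points more than 2 standard deviations from the mean is an assumed rule of thumb, not something the slides specify.

```python
from statistics import mean, pstdev

# Made-up values, chosen only so the two averages match the slide.
values = [1.9, 2.0, 2.1, 2.2, 2.3, 6.3]


def find_outliers(data, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    centre = mean(data)
    spread = pstdev(data)  # population standard deviation
    return [v for v in data if abs(v - centre) > threshold * spread]


outliers = find_outliers(values)
kept = [v for v in values if v not in outliers]

avg_with = mean(values)
avg_without = mean(kept)
print(f"outliers: {outliers}")
print(f"with outlier: {avg_with:.1f}, without outlier: {avg_without:.1f}")
```

Because the outlier itself inflates both the mean and the standard deviation, a fixed threshold like this can miss outliers in small samples; it is shown only to illustrate the idea of distance from the mean.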
Reading a Scatter diagram
• If variables are correlated, points fall along a line or curve
• The better the correlation, the tighter the points hug the line
[Figure: three panels labelled Positive correlation, Negative correlation (as one data set increases, the other decreases), and No correlation]