1. Henry R. Kang (1/2010)
General Chemistry
Lecture 6
Graphing
2. Henry R. Kang (7/2008)
Outline
• Logarithm and Exponent
Definitions and rules
• Graphing
Graphing rules
Line drawing
• Linear Regression
Derivation of equations
Goodness of data fitting
Draw the least-square line
Definition of Exponent
• An exponent is the power to which a base value is raised:
$Y = B^X$
B is the base and X is the power.
Any positive number other than 1 can be used as the base; bases greater than 1 are the usual choice.
The power X can be a number or a more complex expression such as X = a/4 or X = (a+b)/2.
Commonly used bases are 10 and e = 2.7182818… (e is an irrational number).
Computers use the binary system, where the base is 2.
Definition of Logarithm
• The logarithm is the inverse of the exponent: if $Y = B^X$, then
• $\log_B Y = X$.
Two popular bases are 10 and e.
• The common (or Briggsian) logarithm uses the base value of 10.
$\log_{10} Y = X$ or $\log Y = X$.
Some values of the common logarithm:
log(1) = 0; log(10) = 1; log(100) = 2; etc.
• The natural (or hyperbolic) logarithm uses the base value of e = 2.7182818…
$\log_e Y = X$ or $\ln Y = X$.
• Common and natural logarithms can be interconverted.
$\ln Y = 2.30259 \times \log Y$ or $\log Y = \ln Y / 2.30259$.
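The inverse relationship and the conversion factor can be checked numerically. The following is a minimal sketch using Python's standard `math` module; the value Y = 350 is an arbitrary choice made here for illustration and does not come from the lecture.

```python
import math

Y = 350.0  # arbitrary positive test value (assumption, not from the lecture)

# A logarithm undoes an exponent: 10**log(Y) and e**ln(Y) both give back Y.
common_log = math.log10(Y)    # log Y (base 10)
natural_log = math.log(Y)     # ln Y (base e)
back_from_common = 10 ** common_log
back_from_natural = math.exp(natural_log)

# The conversion factor between the two logarithms is ln(10).
LN10 = math.log(10)            # about 2.30259
converted = LN10 * common_log  # should agree with natural_log

print(f"log Y = {common_log:.6f}, ln Y = {natural_log:.6f}")
print(f"ln(10) = {LN10:.5f}, 2.30259 x log Y = {converted:.6f}")
```

The factor 2.30259 quoted above is ln(10) rounded to five decimal places.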
Rules of Logarithms
• The following rules apply to both common and natural
logarithms:
• log(A×B) = logA + logB.
log(5×7) = log(5) + log(7) = 0.698970004 + 0.84509804 = 1.544068044
log(35) = 1.544068044
• log(A/B) = logA – logB.
log(5/7) = log(5) – log(7) = 0.698970004 - 0.84509804 = -0.146128036
log(5/7) = log(0.714285714) = -0.146128036
• $\log(A^n) = n \log A$.
$\log(7^3) = 3 \log(7) = 2.53529412$
$\log(7^3) = \log(343) = 2.53529412$
• $\log(A^{-n}) = -n \log A$.
• $\log(A^{1/n}) = (1/n) \log A$.
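Each rule can be verified by evaluating both sides. This is a small sketch with the same numbers as the examples above (5, 7, and the power 3); the dictionary layout and variable names are only illustrative.

```python
import math

A, B, n = 5.0, 7.0, 3

# Each entry holds (left-hand side, right-hand side) of one rule.
rules = {
    "log(A*B) = log A + log B": (math.log10(A * B), math.log10(A) + math.log10(B)),
    "log(A/B) = log A - log B": (math.log10(A / B), math.log10(A) - math.log10(B)),
    "log(B^n) = n log B": (math.log10(B ** n), n * math.log10(B)),
    "log(B^-n) = -n log B": (math.log10(B ** -n), -n * math.log10(B)),
    "log(B^(1/n)) = (1/n) log B": (math.log10(B ** (1 / n)), math.log10(B) / n),
}

for name, (lhs, rhs) in rules.items():
    print(f"{name}: {lhs:.9f} vs {rhs:.9f}")
```

The two sides agree to within floating-point rounding. The same checks work with `math.log` in place of `math.log10`, since the rules hold for both common and natural logarithms.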
Advantages of Graphing
• A graph or figure is a very powerful means
of delivering information.
The information is very compactly
represented.
The relationship between parameters is clearly
shown.
The general characteristics of the parameters
can be derived.
• A picture is worth a thousand words.
Graphing Rules – Label & Size
• A graph should be neatly presented, easily readable, and properly titled.
Each axis should be clearly labeled with the name of the parameter and its unit.
Scales should be selected so that the plotted data cover at least 50% of the space available.
Example – Correct Size
[Figure: Pressure (mmH2O) versus Temperature (°C)]
Example – Incorrect Size
[Figure: Pressure (mmH2O) versus Temperature (°C)]
Graphing Rules - Axis
• An axis scale does not need to start at “zero”.
This avoids clustering the data points in a narrow range.
Exception: the scale must include zero if the line has to be extrapolated to the x-axis or y-axis intercept.
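The effect of the axis range can be put in numbers. The sketch below computes the fraction of an axis that the data span; the helper name `axis_coverage` is hypothetical, and the temperatures and the two axis ranges (0 to 20 and 16 to 19) are taken from the lecture's volume versus temperature example.

```python
def axis_coverage(values, axis_min, axis_max):
    """Return the fraction of the axis length spanned by the data."""
    return (max(values) - min(values)) / (axis_max - axis_min)

# Temperatures (°C) clustered between 16.5 and 18.5.
temperatures = [17.0, 16.8, 17.8, 18.5, 18.2, 17.5, 16.5]

from_zero = axis_coverage(temperatures, 0, 20)   # axis starts at zero
expanded = axis_coverage(temperatures, 16, 19)   # axis starts near the data

print(f"0 to 20 axis:  data span {from_zero:.0%} of the axis")
print(f"16 to 19 axis: data span {expanded:.0%} of the axis")
```

On the 0 to 20 axis the points occupy only a tenth of the width, while on the 16 to 19 axis they occupy about two thirds.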
Data Cluster - Incorrect
[Figure: Volume (mL) versus Temperature (°C), with the temperature axis running from 0 to 20 °C]
Expand the Scale
[Figure: Volume (mL) versus Temperature (°C), with the temperature axis running from 16 to 19 °C]
Extrapolation
[Figure: Y versus X, with the X axis running from -10 to 20]
Graphing Rules - Divisions
• The scale on the graph paper should have divisions that are easily “divided by the eye”.
Use 1, 2, 5, or 10 units per division.
Do not use 3, 6, 7, or 11.
[Figure: example axis scales. Use 0, 5, 10, 15, 20 or 0, 2, 4, 6, 8; not 0, 3, 6, 9, 12. The vertical examples use steps of 1, 2, 5, and 10.]
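One way to apply the 1, 2, 5, 10 rule automatically is sketched below. The function name `nice_step` and the `max_divisions` parameter are assumptions made for this illustration, not part of the lecture, and the sketch assumes `data_max` is greater than `data_min`.

```python
import math

def nice_step(data_min, data_max, max_divisions=10):
    """Smallest step of the form 1, 2, 5, or 10 times a power of ten
    that covers the data range in at most max_divisions divisions."""
    raw = (data_max - data_min) / max_divisions   # step if any value were allowed
    power = 10 ** math.floor(math.log10(raw))     # largest power of ten <= raw
    for multiple in (1, 2, 5, 10):
        step = multiple * power
        if step >= raw:
            return step
    return 10 * power  # fallback for floating-point edge cases

print(nice_step(0, 20, 4))     # a range of 20 in at most 4 divisions
print(nice_step(0, 1400, 7))   # a range of 1400 in at most 7 divisions
print(nice_step(0, 12, 4))     # a raw step of 3 is rounded up to 5
```

The last call shows the point of the rule: 12 divided into 4 parts would give steps of 3, which the helper replaces with 5.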
Graphing Rules – Table of Data
• A table of data may be provided (this rule is not always obeyed).
No individual coordinates of data points should appear on the graph.
Example
[Figure: Volume (mL) versus Temperature (°C), with the temperature axis running from 16 to 19 °C; one data point is labeled (17.0, 6.58)]
T (°C)   V (mL)
17.0     6.58
16.8     5.76
17.8     7.71
18.5     8.84
18.2     8.47
17.5     7.04
16.5     5.25
Graphing Rules - Resolution
• Ideally, one should be able to read all significant figures of the data from its position on the graph paper.
Often, the data have more significant figures than the graph paper can resolve.
Graphing Rules – Data Symbols
• Each data point should be marked with a circle (or another shape such as a square or triangle) drawn around it.
• If there is more than one data set, each set should have its own shape (or symbol) to represent its data points.
You can also use a different color for each data set if color display or printing is available.
Example
[Figure: Y versus X for three data sets (Series1, Series2, Series3)]
Graphing Rules - Drawing
• The curve or straight line is drawn smoothly among the points, rather than by connecting the dots (piece-wise linearization).
The curve or line should represent the best average of the data.
Roughly equal numbers of points should lie above and below the curve or line.
The curve or line does not have to touch any of the data points.
Use a clear straightedge for drawing lines.
Use French curves for drawing curves.
Incorrect Way of Line Drawing
[Figure: Volume (mL) versus Temperature (°C), with the temperature axis running from 16 to 19 °C]
Correct Way of Line Drawing
[Figure: Volume (mL) versus Temperature (°C), with the temperature axis running from 16 to 19 °C]
Graphing Rules – Data Points
• Data points should not lie on an axis line.
This rule is not always obeyed.
You still plot the point if it lies on the axis line.
Linear Regression - Equation
• Linear regression is used to find the
straight line that fits the data best.
• General equation for a line is
Y = m X + b
X is the independent variable,
Y is the dependent variable,
m is the slope of the line, and
b is the y-axis intercept
Linear Regression - Deviation
• Let $y_n$ = the observed value
• and $\hat{y}_n$ = the value calculated from the linear equation.
• The deviation is
$d_n = \hat{y}_n - y_n$ (n = 1, 2, 3, …, N)
N is the number of data points (x, y pairs).
• The best result is obtained by minimizing the sum of the squares of the deviations:
$\sum d_n^2 = (\hat{y}_1 - y_1)^2 + (\hat{y}_2 - y_2)^2 + \cdots + (\hat{y}_N - y_N)^2$
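To see what is being minimized, the sum of squared deviations can be computed for any candidate line. The sketch below uses the three data points of the lecture's worked regression example; the function name and the "guessed" line (m = 0.20, b = 20.0) are assumptions added for comparison, while m = 0.150 and b = 12100/600 are the values the worked example arrives at.

```python
def sum_squared_deviations(m, b, xs, ys):
    """Sum of d_n^2, where d_n = (m*x_n + b) - y_n."""
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys))

# Data from the lecture's worked regression example.
xs = [0.0, 10.0, 20.0]   # temperature (C)
ys = [20.0, 22.0, 23.0]  # volume (liter)

guess = sum_squared_deviations(0.20, 20.0, xs, ys)         # a line chosen by eye
best = sum_squared_deviations(0.15, 12100 / 600, xs, ys)   # least-squares m and b

print(f"guessed line:       sum of d^2 = {guess:.4f}")
print(f"least-squares line: sum of d^2 = {best:.4f}")
```

The guessed line passes exactly through two of the three points, yet its sum of squared deviations is larger than that of the least-squares line.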
Linear Regression – Minimize Deviation
• Calculating $\hat{y}_n$ from the linear equation, we have
$\hat{y}_n = m x_n + b$
• The deviation becomes
$d_n = \hat{y}_n - y_n = m x_n + b - y_n$
• The square of the deviation is
$d_n^2 = (m x_n + b - y_n)^2 = m^2 x_n^2 + b^2 + y_n^2 + 2 m b x_n - 2 m x_n y_n - 2 b y_n$
• The overall deviation is
$\sum d_n^2 = \sum (m x_n + b - y_n)^2$
• $\sum d_n^2$ can be minimized by taking the partial derivatives with respect to m and b, respectively, and setting each to zero.
Linear Regression - Formulas
• Minimize $\sum d_n^2$ by taking the partial derivatives with respect to m (slope) and b (intercept):
$\partial(\sum d_n^2)/\partial m = \sum (2 m x_n^2 + 2 b x_n - 2 x_n y_n) = 0$
$\partial(\sum d_n^2)/\partial b = \sum (2 b + 2 m x_n - 2 y_n) = 0$
• This set of equations can be solved for m and b:
$m = \frac{N(\sum x_n y_n) - (\sum x_n)(\sum y_n)}{N(\sum x_n^2) - (\sum x_n)^2}$
$b = \frac{(\sum x_n^2)(\sum y_n) - (\sum x_n)(\sum x_n y_n)}{N(\sum x_n^2) - (\sum x_n)^2}$
where N is the number of data points.
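The algebra between the two derivative conditions and the expressions for m and b can be written out as follows. This is a sketch of the intermediate steps, using the same symbols as above; it relies only on the fact that summing the constant b over N points gives Nb.

```latex
% Divide both conditions by 2 and move the y terms to the right-hand side:
\begin{align*}
m \sum x_n^2 + b \sum x_n &= \sum x_n y_n \\
m \sum x_n + N b &= \sum y_n
\end{align*}
% Eliminate b: multiply the first equation by N, the second by \sum x_n, and subtract.
\begin{align*}
m \Bigl[ N \sum x_n^2 - \bigl(\textstyle\sum x_n\bigr)^2 \Bigr]
  &= N \sum x_n y_n - \sum x_n \sum y_n
\end{align*}
% Eliminate m: multiply the first equation by \sum x_n, the second by \sum x_n^2, and subtract.
\begin{align*}
b \Bigl[ N \sum x_n^2 - \bigl(\textstyle\sum x_n\bigr)^2 \Bigr]
  &= \sum x_n^2 \sum y_n - \sum x_n \sum x_n y_n
\end{align*}
```

Dividing each result by the bracketed term gives the formulas for m and b, which share the same denominator.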
Linear Regression - Example
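$r = \frac{N(\sum x_n y_n) - (\sum x_n)(\sum y_n)}{[N(\sum x_n^2) - (\sum x_n)^2]^{1/2} \, [N(\sum y_n^2) - (\sum y_n)^2]^{1/2}}$
n     x (°C)   y (liter)   xy      x²      y²
1     0.0      20.0        0.0     0.0     400.
2     10.0     22.0        220.    100.    484.
3     20.0     23.0        460.    400.    529.
Sum   30.0     65.0        680.    500.    1413
The Sum row gives ∑xn, ∑yn, ∑xn yn, ∑xn², and ∑yn², in that order.
$m = \frac{N(\sum x_n y_n) - (\sum x_n)(\sum y_n)}{N(\sum x_n^2) - (\sum x_n)^2} = \frac{3 \times 680. - 30.0 \times 65.0}{3 \times 500. - 30.0^2} = \frac{2040. - 1950.}{1500. - 900.} = \frac{90.0}{600.} = 0.150$
$b = \frac{(\sum x_n^2)(\sum y_n) - (\sum x_n)(\sum x_n y_n)}{N(\sum x_n^2) - (\sum x_n)^2} = \frac{500. \times 65.0 - 30.0 \times 680.}{3 \times 500. - 30.0^2} = \frac{12100.}{600.} = 20.2$
$r = \frac{3 \times 680. - 30.0 \times 65.0}{(3 \times 500. - 30.0^2)^{1/2} \, (3 \times 1413 - 65.0^2)^{1/2}} = \frac{90.0}{(600.)^{1/2} \, (14)^{1/2}} = 0.982$
$r^2 = 0.964$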
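The worked example can be reproduced in a few lines. The following sketch applies the lecture's formulas for m, b, and r to the three data points; the variable names are chosen here for readability and are not part of the lecture.

```python
import math

xs = [0.0, 10.0, 20.0]   # x, temperature (C)
ys = [20.0, 22.0, 23.0]  # y, volume (liter)
N = len(xs)

# The five sums from the table.
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

# Slope, intercept, and correlation coefficient from the lecture's formulas.
denominator = N * sum_x2 - sum_x ** 2
m = (N * sum_xy - sum_x * sum_y) / denominator
b = (sum_x2 * sum_y - sum_x * sum_xy) / denominator
r = (N * sum_xy - sum_x * sum_y) / (
    math.sqrt(denominator) * math.sqrt(N * sum_y2 - sum_y ** 2)
)

print(f"m = {m:.3f}, b = {b:.1f}, r = {r:.3f}, r^2 = {r * r:.3f}")
```

With rounding to the same number of figures as the slide, this prints m = 0.150, b = 20.2, r = 0.982, r^2 = 0.964.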
Linear Regression - Goodness
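• The “goodness” of the data fitting is expressed by $r^2$, the square of the correlation coefficient r (also called the coefficient of determination).
• If $r^2 = 1$, perfect fit
• If $r^2 > 0.95$, excellent fit
• If $r^2 > 0.90$, good fit
• If $r^2 > 0.80$, reasonable fit
• If $r^2 = 0$, completely unrelated (no linear relationship between x and y)
$r = \frac{N(\sum x_n y_n) - (\sum x_n)(\sum y_n)}{[N(\sum x_n^2) - (\sum x_n)^2]^{1/2} \, [N(\sum y_n^2) - (\sum y_n)^2]^{1/2}}$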
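The thresholds above translate directly into a lookup. This is a minimal sketch; the function name `describe_fit` is hypothetical, and because the lecture gives no label for values between 0 and 0.80, the sketch says so rather than inventing one.

```python
def describe_fit(r_squared):
    """Map r^2 to the lecture's rule-of-thumb label."""
    if r_squared == 1:
        return "perfect fit"
    if r_squared > 0.95:
        return "excellent fit"
    if r_squared > 0.90:
        return "good fit"
    if r_squared > 0.80:
        return "reasonable fit"
    if r_squared == 0:
        return "completely unrelated"
    return "no label given in the lecture"

for value in (1.0, 0.964, 0.93, 0.85, 0.5, 0.0):
    print(f"r^2 = {value:5.3f}: {describe_fit(value)}")
```

By these thresholds, the worked example's value of 0.964 counts as an excellent fit.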
Draw the Least-Square Line
• Once the slope m and intercept b are calculated from a
set of data (x and y), the best line can be drawn to fit the
data.
• A line, y = mx + b, is defined by two points.
• The two points that cost the least to compute are the intercepts:
Set x = 0, then y = b, giving the first point (0, b).
Set y = 0, then x = -b/m, giving the second point (-b/m, 0); this requires m ≠ 0.
Plot these two points on the graph, then draw a straight line connecting them.
This line is the least-squares line that best fits the given data set (x and y), with the minimum sum of squared errors between the calculated and measured y values.
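The two intercept points can be computed as in the sketch below, which uses the rounded slope and intercept from the worked example (m = 0.150, b = 20.2). The function name `intercept_points` and the alternative of evaluating the line at the ends of the measured x range are additions made here for illustration, not part of the lecture.

```python
def intercept_points(m, b):
    """Return the y-axis and x-axis intercepts of y = m*x + b."""
    if m == 0:
        raise ValueError("a horizontal line (m = 0) has no x-axis intercept")
    return (0.0, b), (-b / m, 0.0)

# Rounded slope and intercept from the lecture's worked example.
m, b = 0.150, 20.2
y_intercept, x_intercept = intercept_points(m, b)
print("first point: ", y_intercept)
print("second point:", x_intercept)

# Alternative: evaluate the line at the ends of the measured x range (0 to 20).
x_low, x_high = 0.0, 20.0
end_points = [(x, m * x + b) for x in (x_low, x_high)]
print("end points:  ", end_points)
```

For this data set the x-axis intercept falls well below the measured range of 0 to 20, so in practice it may be more convenient to plot the line through two points evaluated inside the range of the graph.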