#Data Visualization #algorithm #Infographic
Have you ever wondered how Excel sets the upper and lower limits on the vertical axis of a chart, and how this can lead to a misleading chart?
I had not, until one day I found an obvious mistake in Excel’s dual vertical axes chart.
The mistake arises because Excel has no algorithm that addresses the most important and unavoidable question for dual vertical axes charts: “How should the upper and lower limits be set on the TWO vertical axes?” Instead, Excel simply applies the algorithm it uses for single vertical axis charts to each vertical axis separately. As a result, the elongations of the two axes are not coordinated to be the same, which makes its dual vertical axes charts misleading.
To fix this critical mistake, Graphician invented a patented algorithm that can create 100% correct dual vertical axes charts. We have also created a trial Excel Add-in that can adjust any dual vertical axes chart created by Excel 2007 or a later version with a single click.
You can now download the Add-in at http://www.graphician.com/patent-01.html. We hope you find the Add-in interesting and useful, and we would love to hear any comments you have about it. You may contact us at graphician1122@gmail.com or visit our website, www.graphician.com, for more information.
1. Dual vertical axes chart scaling algorithm
comparison with Excel’s
May 2016
Confidential
Questions/Comments?
Please contact Jennifer Lin
at graphician1122@gmail.com
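2. Executive Summary
[Figure: two dual vertical axes charts of series A and B over Q1 and Q2, with data labels 100 to 36 and 100 to 84. Graphician algorithm: both axes run from 36 to 100. Excel dual vertical axes chart: one axis runs from 0 to 120 and the other from 75 to 105.]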
Graphician is pleased to present one of our 5 US-patented algorithms:
An algorithm that truthfully presents the information in the data graphically on a dual vertical axes chart.
An algorithm that can be easily incorporated into conventional tabular data applications such as Excel.
Excel dual vertical axes chart: the Excel algorithm misrepresents a decrease from 100 to 36 (-64) and a decrease from 100 to 84 (-16) as being the same.
Graphician algorithm: shows that a decrease from 100 to 36 (-64) is larger than a decrease from 100 to 84 (-16).
3. What algorithm does Excel adopt for dual axes charts now?
[Figure: two series over Q1 and Q2, one falling from 100 to -91 (change -191) and one falling from 100 to 94 (change -6). Excel single vertical axis chart: one axis from -150 to 150. Excel dual vertical axes chart: one axis from -150 to 150 and the other from 90 to 102.]
Excel applies the same algorithm it uses for single vertical axis charts when setting the scales of a dual vertical axes chart, so the elongations of the two axes are not coordinated to be the same.
5. Commonly used “Base Value” method misleads too
The Graphician algorithm is the only solution that correctly presents the relationship between the data sets on a chart in all kinds of situations.
No negative base value
Period A B
Q1 100 20
Q2 20 100
Change -80 +80
[Figure: Base Value chart over Q1 and Q2. A goes from 100% to 20% and B from 100% to 500%, on an axis from 0% to 600%.]
Line A’s decrease should be equal to Line B’s increase.
[Figure: Graphician chart over Q1 and Q2. A goes from 100 to 20 and B from 20 to 100; both axes run from 20 to 100.]
With negative base value
Period A B
Q1 100 -20
Q2 -80 -100
Change -180 -80
[Figure: Base Value chart over Q1 and Q2. A goes from 100% to -80% and B from 100% to -300%, on an axis from -400% to 200%.]
Line A’s decrease should be more than Line B’s decrease.
[Figure: Graphician chart over Q1 and Q2. A goes from 100 to -80 and B from -20 to -100; one axis runs from -80 to 100 and the other from -100 to 80.]
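The percentages shown in the two Base Value charts can be reproduced with a short sketch. The formula used here, 100% plus the change divided by the absolute value of the Q1 base, is an assumption inferred from the figures on this slide (the deck does not state it), and the function name is hypothetical.

```python
def base_value_percent(base, value):
    """Express `value` relative to `base`, with the base plotted at 100%.

    Assumed formula (inferred from the slide's numbers, not stated there):
    100% + (value - base) / |base|.
    """
    return 100.0 + (value - base) * 100.0 / abs(base)


# No negative base value: A goes 100 -> 20, B goes 20 -> 100.
a_q2 = base_value_percent(100, 20)
b_q2 = base_value_percent(20, 100)

# With negative base value: A goes 100 -> -80, B goes -20 -> -100.
a_q2_neg = base_value_percent(100, -80)
b_q2_neg = base_value_percent(-20, -100)

print(a_q2, b_q2, a_q2_neg, b_q2_neg)
```

Under this assumed formula, A and B both change by 80 units in the first table, yet B moves by 400 percentage points and A by only 80, which is the distortion the slide points out.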
6. Misled by chart (case 1): What drove the increase in sales?
Period Selling Price Units Sold Sales
Q1 84 9,762 820,000
Q2 100 10,000 1,000,000
Excel’s algorithm makes it look as though the selling price and the units sold both increased, but sales did not increase much.
[Figure: Excel chart of Sales (820,000 to 1,000,000) and Selling Price (84 to 100) over Q1 and Q2. One axis runs from 0 to 1,200,000 and the other from 75 to 105.]
[Figure: Excel chart of Sales (820,000 to 1,000,000) and Units Sold (9,762 to 10,000) over Q1 and Q2. One axis runs from 0 to 1,200,000 and the other from 9,600 to 10,050.]
7. Misled by chart (case 1): What drove the increase in sales? (cont.)
Period Selling Price Units Sold Sales
Q1 84 9,762 820,000
Q2 100 10,000 1,000,000
[Figure: Graphician chart of Sales (820,000 to 1,000,000) and Units Sold (9,762 to 10,000) over Q1 and Q2. One axis runs from 820,000 to 1,000,000 and the other from 8,200 to 10,000.]
[Figure: Graphician chart of Sales (820,000 to 1,000,000) and Selling Price (84 to 100) over Q1 and Q2. One axis runs from 820,000 to 1,000,000 and the other from 82.0 to 100.0.]
The Graphician algorithm shows that the main driver of the increase in sales is the increase in selling price.
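The axis limits on the two Graphician charts can be checked against the table with a few lines of arithmetic. The sketch below applies the rule from the deck's key steps (lower limit = the series' max × the reference series' min ÷ its max), taking Sales as the reference because it has the largest relative change. The helper names are hypothetical, and the code handles positive data only.

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def matched_lower_limit(ref_min, ref_max, other_max):
    """Lower limit that gives another axis the same min/max ratio as the
    reference axis (hypothetical helper, positive data only)."""
    return other_max * ref_min / ref_max


sales = (820_000, 1_000_000)   # Q1, Q2
price = (84, 100)
units = (9_762, 10_000)

changes = {
    "Sales": pct_change(*sales),
    "Selling Price": pct_change(*price),
    "Units Sold": pct_change(*units),
}

# Sales axis spans its own min..max; the other axes are scaled to match it.
price_lower = matched_lower_limit(sales[0], sales[1], price[1])
units_lower = matched_lower_limit(sales[0], sales[1], units[1])

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
print("Selling Price axis lower limit:", price_lower)
print("Units Sold axis lower limit:", units_lower)
```

The computed lower limits agree with the axis ranges on the Graphician charts (82.0 and 8,200), and the relative changes show the selling price rising far more than the units sold.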
9. Misled by chart (case 2): What drove the growth in the number of people employed?
[Figure: Excel chart of “Employed in all industries” and “Employed in non-agricultural industries”, 1994 to 2013. One axis runs from 0 to 160,000 and the other from 110,000 to 150,000.]
Excel suggests there was little relationship between the 2 data sets.
[Figure: Graphician auto scaling algorithm chart of the same two series, 1994 to 2013, with growth labels of 18.7% and 20.3%. One axis runs from 119,651 to 143,952 and the other from 121,392 to 146,047.]
Graphician shows that the increase in employment in all industries came mainly from the increase in employment in non-agricultural industries in modern society.
Source: U.S. Bureau of Labor Statistics
10. Misled by chart (case 3): Which stock performed better?
[Figure: Excel chart of Microsoft and Dell share prices, 3/23/2004 to 10/23/2004, with growth labels of 20.1% and 11.1%. One axis runs from 31.00 to 37.00 and the other from 0.00 to 35.00.]
Microsoft’s share price grew 20.1% while Dell’s grew only 11.1%. However, the Excel chart suggests that Dell’s price movement was much larger than Microsoft’s.
[Figure: Graphician auto scaling algorithm chart of the same two series, 3/23/2004 to 10/23/2004, with growth labels of 20.1% and 11.1%. One axis runs from 30.53 to 36.66 and the other from 24.15 to 29.00.]
Graphician shows the movement of the 2 stocks with the same elongation, and shows that Microsoft’s share price performed better than Dell’s.
Source: Stock price database
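One way to see the difference between the two charts is to compute the E-value of each axis from its printed limits, using the (upper - lower) / upper form that the deck applies to positive data. This is only a sketch: the helper name is hypothetical, and the limits are the first and last tick labels of each chart.

```python
def axis_e_value(lower, upper):
    """E-value of an axis with positive limits: (upper - lower) / upper."""
    return (upper - lower) / upper


# (lower, upper) limits of the two vertical axes on each chart.
excel_axes = [(31.00, 37.00), (0.00, 35.00)]
graphician_axes = [(30.53, 36.66), (24.15, 29.00)]

excel_e = [axis_e_value(lo, hi) for lo, hi in excel_axes]
graphician_e = [axis_e_value(lo, hi) for lo, hi in graphician_axes]

print("Excel:", [round(e, 3) for e in excel_e])
print("Graphician:", [round(e, 3) for e in graphician_e])
```

The two Graphician axes have nearly identical E-values, so equal relative moves cover equal heights on the chart. The two Excel axes differ widely, which is why the smaller relative move can look like the larger one.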
11. Key steps of the algorithm
Data: Q1: A = 100, B = 1000; Q2: A = 36, B = 840
Original E-value: A: 0.64 = (100-36)/100; B: 0.16 = (1000-840)/1000
Upper limit of the axis: A: 100 = A’s Max; B: 1000 = B’s Max
Lower limit of the axis: A: 36 = A’s Min; B: 360 = B’s Max × A’s Min/A’s Max
New E-value: A: N/A; B: 0.64 = (1000-360)/1000
4 key steps
1. Calculate the E-value of each sequence (A: 0.64; B: 0.16).
2. Set the upper and lower limits of the axis with the larger E-value (A: 0.64) to its Max and Min (100 and 36).
3. Leave one of the upper and lower limits of the axis with the smaller E-value (B: 0.16) unchanged (1000). (1)
4. Calculate the other limit of the axis with the smaller E-value (360). B’s new E-value (0.64) equals A’s original E-value (0.64). (1)
Note: (1) Which of the upper and lower limits stays unchanged, and how to calculate the other limit, is disclosed in the flowchart on the next page.
[Figure: Excel algorithm chart of A (100 to 36, -64%) and B (1000 to 840, -16%) over Q1 and Q2, with one axis from 0 to 120 and the other from 750 to 1050.]
[Figure: Graphician algorithm chart of the same data, with one axis from 36 to 100 and the other from 360 to 1000.]
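The four key steps can be sketched in a few lines for the simple case shown on this slide. This is a minimal illustration, not the patented implementation: it assumes all values are positive and that the upper limit is the one left unchanged, and the function names are hypothetical.

```python
def e_value(values):
    """E-value of a sequence of positive numbers: (max - min) / max."""
    hi, lo = max(values), min(values)
    return (hi - lo) / hi


def dual_axis_limits(seq_a, seq_b):
    """Return ((lower_a, upper_a), (lower_b, upper_b)).

    Sketch for positive data only, keeping the upper limit unchanged.
    """
    # Step 1: calculate the E-value of each sequence.
    e_a, e_b = e_value(seq_a), e_value(seq_b)

    # Step 2: the axis with the larger E-value spans its own min..max.
    a_is_first = e_a >= e_b
    first, second = (seq_a, seq_b) if a_is_first else (seq_b, seq_a)
    first_limits = (min(first), max(first))

    # Step 3: leave the upper limit of the other axis at its max.
    second_upper = max(second)

    # Step 4: scale the lower limit so both axes share the same E-value.
    second_lower = second_upper * min(first) / max(first)
    second_limits = (second_lower, second_upper)

    if a_is_first:
        return first_limits, second_limits
    return second_limits, first_limits


limits_a, limits_b = dual_axis_limits([100, 36], [1000, 840])
print("A axis:", limits_a)
print("B axis:", limits_b)
```

With the slide's data, B's lower limit comes out as 360, so the E-value of B's axis equals A's original E-value of 0.64.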
12. Step 3 & 4 of the algorithm: Which of the upper and lower limits should be changed, and how?
i. 1st data set refers to the data set with the larger E-value.
ii. 2nd data set refers to the data set with the smaller E-value.
Is │Max value of 1st data set│ ≥ │Min value of 1st data set│?
Yes: Is │Max value of 2nd data set│ ≥ │Min value of 2nd data set│?
Yes: Upper limit of 2nd data set’s axis unchanged = 2nd data set max value. Adjust lower limit of 2nd data set’s axis = 2nd data set max value × 1st data set min value ÷ 1st data set max value.
No: Lower limit of 2nd data set’s axis unchanged = 2nd data set min value. Adjust upper limit of 2nd data set’s axis = 2nd data set min value × 1st data set min value ÷ 1st data set max value.
No: Is │Max value of 2nd data set│ ≥ │Min value of 2nd data set│?
Yes: Upper limit of 2nd data set’s axis unchanged = 2nd data set max value. Adjust lower limit of 2nd data set’s axis = 2nd data set max value × 1st data set max value ÷ 1st data set min value.
No: Lower limit of 2nd data set’s axis unchanged = 2nd data set min value. Adjust upper limit of 2nd data set’s axis = 2nd data set min value × 1st data set max value ÷ 1st data set min value.
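The flowchart can be written as a small function with one branch per outcome. This is a sketch under assumptions: the pairing of each formula with its Yes/No branch is read from the order of the boxes on the slide (only the Yes/Yes branch is confirmed by the deck's worked example), the function name is hypothetical, and the "Max = Min" case that the deck treats separately is not handled.

```python
def second_axis_limits(first_max, first_min, second_max, second_min):
    """Return (lower, upper) for the axis of the 2nd data set.

    The 1st data set is the one with the larger E-value; its axis simply
    spans first_min..first_max. Branch order follows the slide's flowchart.
    """
    if abs(first_max) >= abs(first_min):
        if abs(second_max) >= abs(second_min):
            # Upper limit unchanged, lower limit adjusted.
            return (second_max * first_min / first_max, second_max)
        # Lower limit unchanged, upper limit adjusted.
        return (second_min, second_min * first_min / first_max)
    if abs(second_max) >= abs(second_min):
        # Upper limit unchanged, lower limit adjusted.
        return (second_max * first_max / first_min, second_max)
    # Lower limit unchanged, upper limit adjusted.
    return (second_min, second_min * first_max / first_min)


# Positive data from the key-steps slide: A = 100, 36 and B = 1000, 840.
print(second_axis_limits(100, 36, 1000, 840))

# Negative base value example: A = 100, -80 and B = -20, -100.
print(second_axis_limits(100, -80, -20, -100))
```

For the negative base value example, the result (-100 to 80) agrees with the axis range on the Graphician chart for that data, which supports the assumed branch pairing for the Yes/No case.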
13. Step 3 & 4 of the algorithm (cont.):
i. Here 1st data set is data set A.
ii. Here 2nd data set is data set B.
│100│ ≥ │36│: Yes
│1000│ ≥ │840│: Yes
Upper limit of 2nd data set’s axis unchanged = 1000
Adjust lower limit of 2nd data set’s axis = 1000 × 36 ÷ 100 = 360
Data: Q1: A = 100, B = 1000; Q2: A = 36, B = 840
Original E-value: A: 0.64 = (100-36)/100; B: 0.16 = (1000-840)/1000
Upper limit of the axis: A: 100 = A’s Max; B: 1000 = B’s Max
Lower limit of the axis: A: 36 = A’s Min; B: 360 = B’s Max × A’s Min/A’s Max
New E-value: A: N/A; B: 0.64 = (1000-360)/1000
[Figure: Graphician algorithm chart of A (100 to 36, -64%) and B (1000 to 840, -16%) over Q1 and Q2, with one axis from 36 to 100 and the other from 360 to 1000.]
15. Demonstration of all kinds of situations
To show that the patented algorithm presents the true interaction of data with the same elongation ratio in all kinds of situations, we demonstrate one example for each situation.
Although the algorithm can be applied to charts with multiple vertical axes, to simplify the demonstration we assume there are only two sequences, sequence (A) and sequence (B), and that each sequence has only two data points, the 1st and the 2nd.
For any sequence of data, the range of the Max value and the Min value can only be one of 4 situations:
1: Max ≥ 0 Min ≥ 0 │Max│>│Min│
2: Max > 0 Min < 0 │Max│>│Min│
3: Max ≥ 0 Min < 0 │Max│<│Min│
4: Max < 0 Min < 0 │Max│<│Min│
We define:
a1 = Max value of sequence (A); an = Min value of sequence (A)
b1 = Max value of sequence (B); bn = Min value of sequence (B)
A1: a1 ≥ 0 an ≥ 0 │a1│>│an│
A2: a1 > 0 an < 0 │a1│>│an│
A3: a1 ≥ 0 an < 0 │a1│<│an│
A4: a1 < 0 an < 0 │a1│<│an│
B1: b1 ≥ 0 bn ≥ 0 │b1│>│bn│
B2: b1 > 0 bn < 0 │b1│>│bn│
B3: b1 ≥ 0 bn < 0 │b1│<│bn│
B4: b1 < 0 bn < 0 │b1│<│bn│
There are 16 (= 4 × 4) combinations in total across sequences (A) and (B).
Note: The case of “Max = Min” is not included because it receives special treatment, as disclosed in the patent.
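The four situations translate directly into a small classifier, which also makes the 16 combinations easy to enumerate. This is an illustrative sketch with hypothetical function names; inputs that match none of the four listed conditions (for example Max = Min, which the deck handles separately) return None.

```python
from itertools import product


def situation(values):
    """Return 1-4 for the situation a sequence falls into, else None."""
    hi, lo = max(values), min(values)
    if hi >= 0 and lo >= 0 and abs(hi) > abs(lo):
        return 1
    if hi > 0 and lo < 0 and abs(hi) > abs(lo):
        return 2
    if hi >= 0 and lo < 0 and abs(hi) < abs(lo):
        return 3
    if hi < 0 and lo < 0 and abs(hi) < abs(lo):
        return 4
    return None  # not covered by the four listed situations


def combination(seq_a, seq_b):
    """Label such as 'A1 x B4' for a pair of sequences, else None."""
    sa, sb = situation(seq_a), situation(seq_b)
    if sa is None or sb is None:
        return None
    return f"A{sa} x B{sb}"


all_combinations = [f"A{i} x B{j}" for i, j in product(range(1, 5), repeat=2)]

print(combination([100, -80], [-20, -100]))
print(len(all_combinations))
```

The first print uses the negative base value data from earlier in the deck, which falls into situation 2 for A and situation 4 for B.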