Encapsulation and Abstraction for
Modeling and Visualizing
Information Uncertainty
Alexander Streit
Bachelor of Information Technology (Honours)
Queensland University of Technology
A thesis submitted in partial fulfilment of the requirements for the degree of
Doctor of Philosophy
November 2007
Principal Supervisor: Prof. Binh Pham
Associate Supervisor: Dr. Ross Brown
Faculty of Information Technology
Queensland University of Technology
Brisbane, Queensland, AUSTRALIA
© Copyright by Alexander Streit 2007
All Rights Reserved
Dedication
For Jasper, Fral, and Jilli.
Keywords
Information Uncertainty Visualization, Information Uncertainty Modeling, Spread-
sheets, Visualization Spreadsheets, Uncertainty Visualization Spreadsheets, Visualiza-
tion Tools, Modeling Tools, Uncertainty Modeling, Uncertainty Visualization, Proba-
bility, Fuzzy Visualization, Visualization Frameworks, Visualization
Abstract
Information uncertainty is inherent in many real-world problems and adds a layer of
complexity to modeling and visualization tasks. This often causes users to ignore
uncertainty, especially when it comes to visualization, thereby discarding valuable
knowledge. A coherent framework for the modeling and visualization of information
uncertainty is needed to address this issue.
In this work, we have identified four major barriers to the uptake of uncertainty
modeling and visualization. Firstly, there are numerous uncertainty modeling tech-
niques and users are required to anticipate their uncertainty needs before building their
data model. Secondly, parameters of uncertainty tend to be treated at the same level
as variables, making it easy to introduce avoidable errors. This causes the uncertainty
technique to dictate the structure of the data model. Thirdly, propagation of uncertainty
information must be manually managed. This requires user expertise, is error prone,
and can be tedious. Finally, uncertainty visualization techniques tend to be developed
for particular uncertainty types, making them largely incompatible with other forms
of uncertainty information. This narrows the choice of visualization techniques and
results in a tendency for ad hoc uncertainty visualization.
The aim of this thesis is to present an integrated information uncertainty modeling
and visualization environment that has the following main features: information and
its uncertainty are encapsulated into atomic variables, the propagation of uncertainty is
automated, and visual mappings are abstracted from the uncertainty information data
type.
Spreadsheets have previously been shown to be well suited as an approach to visu-
alization. In this thesis, we devise a new paradigm extending the traditional spreadsheet
to intrinsically support information uncertainty.
Our approach is to design a framework that integrates uncertainty modeling tech-
niques into a hierarchical order based on levels of detail. The uncertainty information
is encapsulated and treated as a unit allowing users to think of their data model in terms
of the variables instead of the uncertainty details. The system is intrinsically aware of
the encapsulated uncertainty and is therefore able to automatically select appropriate
uncertainty propagation methods.
A user-objectives based approach to uncertainty visualization is developed to guide
the visual mapping of abstracted uncertainty information. Two main abstractions of
uncertainty information are explored for the purpose of visual mapping: the Unified
Uncertainty Model and the Dual Uncertainty Model. The Unified Uncertainty Model
provides a single view of uncertainty for visual mapping, whereas the Dual Uncertainty
Model distinguishes between possibilistic and probabilistic views. Such abstractions
provide a buffer between the visual mappings and the uncertainty type of the underly-
ing data, enabling the user to change the uncertainty detail without causing the visual-
ization to fail.
Two main case studies are presented. The first case study covers exploratory
and forecasting tasks in a business planning context. The second case study inves-
tigates sensitivity analysis for financial decision support. Two minor case studies are
also included: one to investigate the relevancy visualization objective applied to busi-
ness process specifications, and the second to explore the extensibility of the system
through General Purpose Graphics Processing Unit (GPGPU) use. A quantitative
analysis compares our approach to traditional analytical and numerical spreadsheet-based
approaches. Two surveys were conducted to gain feedback from potential users.
The significance of this work is that we reduce barriers to uncertainty modeling
and visualization in three ways. Users do not need a mathematical understanding of
the uncertainty modeling technique to use it; uncertainty information is easily added,
changed, or removed at any stage of the process; and uncertainty visualizations can be
built independently of the uncertainty modeling technique.
Publications
1. Pham, B. and Streit, A. and Brown, R. “Visualisation of Information Uncertainty: Progress and
Challenges,” in Interactive Visualisation: A State-of-the-Art Survey, Elena Zudilova-Seinstra,
Tony Adriaansen and Robert van Liere (eds.), 2007, Springer, UK. In Print.
2. Streit, A. and Pham, B. and Brown, R. “A Spreadsheet Approach to Facilitate Visualization of
Uncertainty in Information,” IEEE Transactions on Visualization and Computer Graphics, 11
July 2007. IEEE Computer Society Digital Library. IEEE Computer Society, 30 September
2007 <http://doi.ieeecomputersociety.org/10.1109/TVCG.2007.70426>
3. Streit, A. and Pham, B. and Brown, R. Visualisation Support for Managing Large Business Pro-
cess Specifications. International Conference on Business Process Management (BPM). Nancy,
France, September 6-8, 2005. Lecture Notes in Computer Science, Springer. Acceptance rate:
13%
4. Campbell, A. and Berglund, E. and Streit, A. Graphics Hardware Implementation of the
Parameter-Less Self-Organising Map. International Conference on Intelligent Data Engineering and
Automated Learning (IDEAL’05). Brisbane, July 6-8, 2005. Pages 343-350. Lecture Notes in
Computer Science, Springer.
Acknowledgments
This thesis would not have been possible without my principal supervisor, Prof. Binh
Pham, and my associate supervisor, Dr. Ross Brown. Both collaborated to teach me
their process for completing research projects, invaluable knowledge for which I am
very grateful.
I wish especially to thank Fral, who supported me even when it didn’t seem rational
to do so, and my mother, Jilli, who should really be receiving this degree herself. I
also wish to thank my Honors supervisor, Ruth Christie, who inspired me to pursue
postgraduate studies in the first instance.
I wish to thank my colleague, Dr. Robert Smith, who provided me with extensive
insight and feedback, and Alexander Campbell for his many comments and suggestions.
Finally, I wish to thank my business associate, Dr. Andy Boud, for acting as an
unofficial mentor.
Abbreviations
ASP Analytical Spreadsheet Package
BPM Business Process Management
DUM Dual Uncertainty Model
EBNF Extended Backus-Naur Form
GIS Geographic Information Systems
GPGPU General Purpose Graphics Processing Unit
LIC Line Integral Convolution
NaN Not A Number
NIST National Institute of Standards and Technology
NPV Net Present Value
PDF Probability Density Function
PMF Probability Mass Function
QUM Quad Uncertainty Model
SI The Spreadsheet for Images
SIV Spreadsheet for Information Visualization
UML Unified Modeling Language
UUM Unified Uncertainty Model
VTK The Visualization Toolkit
Contents
Abstract
1 Introduction
1.1 Motivation
1.2 Aims
1.3 Scope
1.4 Original Contribution
1.5 Significance
1.6 Organization of the Thesis
2 Background
2.1 Introduction
2.2 Information Uncertainty
2.2.1 Sources of Information Uncertainty
2.2.2 Understanding Information Uncertainty
2.2.3 Approaches to Modeling Information Uncertainty
2.3 Visualization
2.3.1 The Sensemaking Process
2.3.2 Visualization Techniques
2.4 Information Uncertainty Visualization Approaches
2.4.1 Low-level Features
2.4.2 Higher-level Constructions
2.5 Summary
3 Framework for Integrated Uncertainty Modeling and Visualization
3.1 A New Approach to Information Uncertainty
3.2 Analysis of Issues and Requirements
3.2.1 Ad hoc Visualization Techniques
3.2.2 Incoherence of Uncertainty Models
3.2.3 Artificial Separation of Information and Uncertainty
3.3 Components of the Framework
3.3.1 Spreadsheet Paradigm
3.3.2 Uncertainty Encapsulation
3.3.3 Uncertainty Abstraction
3.4 Summary
4 Spreadsheet Paradigm for Information Uncertainty
4.1 Motivation and Objectives
4.2 Related Work on Spreadsheets
4.3 Architecture and Features
4.3.1 Uncertainty Encapsulation
4.3.2 Uncertainty Abstraction
4.4 New Process and Workflow
4.5 Capabilities and Advantages
4.6 Case Study: Financial Decision Support
5 Uncertainty Encapsulation and Automated Propagation
5.1 Introduction
5.2 Unified Information Uncertainty Framework
5.2.1 Conceptualizing Information Uncertainty and its Usage
5.2.2 Categorization of Uncertainty Models
5.2.3 Data Structures for Information Uncertainty
5.3 Automated Propagation of Information Uncertainty
5.3.1 Uncertainty Propagation Model
5.3.2 Hierarchical Heterogeneous Propagation
5.4 Summary
6 Uncertainty Abstraction for Visualization
6.1 Motivation and Objectives
6.2 User-objectives for Information Uncertainty Visualization
6.2.1 Analysis of User-Objectives
6.2.2 A Computer Assisted User-Objectives Selection Method
6.3 Uncertainty Abstraction Models
6.3.1 The Unified Uncertainty Model
6.3.2 The Dual Uncertainty Model
6.3.3 Design and Use
6.3.4 Alternative Models
6.4 Case Study: User-Objectives in Financial Decision Support
6.5 Case Study: Relevancy Objective in Business Process Management
7 Integration of Core Features
7.1 Introduction
7.2 Design Considerations
7.3 Architecture
7.3.1 User Interface
7.3.2 Core Components
7.3.3 Plugin Components
7.4 Summary
8 Advanced Features and Extensibility
8.1 Introduction
8.2 Advanced Features
8.2.1 Hierarchical Spreadsheets
8.2.2 Floating Observers and Embedded Visualizations
8.2.3 Customization
8.3 Extensibility
8.4 Case Study: GPGPUSheet
9 Evaluation
9.1 Introduction
9.2 Quantitative Analysis
9.2.1 Construction Experiments
9.2.2 Retrospection Experiments
9.2.3 Discussion
9.3 Sensitivity Analysis Surveys
9.3.1 First Survey
9.3.2 Second Survey
9.4 Case Study: Business Planning
10 Conclusion and Future Work
10.1 Achievements
10.2 Limitations
10.3 Possible Applications and Extensions
Bibliography
A First Survey
B Second Survey
List of Figures
2.1 Fuzzy Set for Hot
2.2 Fuzzification for Temperature
2.3 Results of Fuzzy Operations are Shown by the Grey Shaded Regions
2.4 Defuzzification Using an α-cut
2.5 Example Rough Set for Containment of a Region
2.6 The Multi-agent Visualization Support System
2.7 The Visualization Task Network (VTN) Learns Task-oriented Visualization Parameters
2.8 An Ontology of Visualization
2.9 Visualization Techniques Categorized by the Type of Data to be Visualized
2.10 Examples of Selected Visualization Techniques
2.11 The Model-based Visualization Taxonomy
2.12 Relationship between Uncertainty Visualization, Information Uncertainty Visualization, Error Visualization, and Fuzzy Visualization
2.13 Using Opacity to Show the Structure of Uncertainty. Color Scheme (left), Normal Rendering (center), Uncertainty Structure (right)
2.14 Visual Mappings Showing Difference. From Left to Right: Overlay, Rainbow Mapping, White-black-white Pseudo-coloring, Glyph (Hi-pass), Glyph (low-pass)
2.15 Tip Level Based on the Quality of the Food and Service Using Fuzzy Inference
2.16 Two Frames from a Luminosity Oscillation Animation
2.17 Visualization With Probability Density Functions Over Associated Data Points
3.1 Visualizations of Employment Numbers in California. Years 2005-2010 are Predicted. (a) Assuming Average Growth (b) Indicating Growth is Estimated (c) Possible Growth (d) Likely Growth. (Data Source: California Employment Development Department)
3.2 Description of the Framework
3.3 Additional Layer in Spreadsheet Hierarchy
3.4 Screenshot of the Prototype System
4.1 Basic Cell Type Object Hierarchy
4.2 Novel CellType Object Hierarchy
4.3 Screen-shot of the Prototype
4.4 Visualization Sheet for the Graph in Figure 4.7
4.5 Process for Constructing an Uncertainty Spreadsheet
4.6 Interval Modeling Example: (a) Original Model (b) Traditional Spreadsheet (c) Prototype System Uncertainty Hidden (d) Prototype System Uncertainty Shown
4.7 Using An Interval (±0.5) for Annual Change in Interest Rates Propagates the Uncertainty to NPV
4.8 Volumetric Representation of the Most Likely Effect Interest Rate Changes Will Have on NPV.
5.1 Progression Through States of Information Uncertainty (Boxes) as a Result of Information (Arrows)
5.2 Projection of Information Uncertainty onto an Estimate Point
5.3 All Collapses of the Interval [4.5,5.5]
5.4 All Collapses of a Fuzzy Number Around 5
5.5 Collapses for a Fuzzy Number Around 5, Using a Cut Plane
5.6 The Interval Data Structure
5.7 Definition of Continuous Rough Set Using a Marker Sequence
5.8 Definition of Linearly Defined Fuzzy Set Using a Marker Sequence
5.9 Information Uncertainty Modeling Techniques Sorted into Three Strata
5.10 Example of Increasing Levels of Uncertainty Information
5.11 Sample Promotion/Demotion Graph
6.1 Graphs illustrating the Visual Treatment of Information with Variable Degrees of Uncertainty under Different Objectives.
6.2 Schematic Illustration of the Dual Uncertainty Model
6.3 UML Diagram of the Dual Uncertainty Model
6.4 Illustration of the Quad Uncertainty Model
6.5 Example of a Recursive UUM
6.6 Possible Effects of House Price Movements on NPV (2D).
6.7 Possible Effects of House Price Movements on NPV (3D).
6.8 Possible Effects of House Prices and Interest Rates on NPV.
6.9 Most Likely Profitability Resulting from Changes in Interest Rates
6.10 Volumetric Representation of the Most Likely Effect Interest Rate Changes Will Have on NPV.
6.11 Likelihood of NPV
6.12 Effect of Interest Rate Changes Grouped into 5 Year Periods
6.13 Optimum Time to Sell the Property Under Different Economic Conditions.
6.14 Architecture of the Case Study System
6.15 YAWL query: Prototype Tool for the Graphical Business Specification Reduction
6.16 Production Rules for Reducing a YAWL Graph
6.17 Original Graph Prior to Simplification
6.18 Reduced Specification using Collapse Approach (α = 2.5)
6.19 Reduced Specification for Text Query “legal” Using Decimation Approach (β = 0.5, α = 1)
7.1 Design of the User Interface
7.2 The Spreadsheet Architecture
7.3 High-level Architecture
7.4 View and Controller Classes for the Spreadsheet
7.5 The Core Components
7.6 UML Inheritance Diagram for the Kernel Class
7.7 Main Classes in the Datamodel Component
7.8 Relationship of the Dependency Graph to Cells and CellContainers
7.9 Cell C is Dependent on A Multiple Times
7.10 UML Inheritance Diagram for the DependencyGraph Class
7.11 Example Formula and its CodeTree
7.12 Formula Language Definition
7.13 UML Diagram for the CodeTree Class
7.14 UML Inheritance Diagram for the Propagation Models
7.15 The IPropagationMethod Class
7.16 The IPlugin Interface
7.17 The ICellType Interface
7.18 The IDUMMethod Interface
7.19 The UncertaintyRange and UncertaintyRangeSet Classes
7.20 UML Inheritance Diagram for the IVisualElement Interface
8.1 Hierarchical Spreadsheet Prototype. The Parent Sheet (Left) Contains the Child Sheet (Right)
8.2 Floating Observer, Observing the Uncertainty Line Graph in Cell D14
8.3 Dependency Tree for the Floating Observer from Figure 8.2
8.4 CellType List Editor
8.5 Propagation Model Editor
8.6 Propagation Model, Method, and Model Set
8.7 Propagation Model Set Editor
8.8 Dual Uncertainty Model Selector
8.9 Prototype system for GPGPU visualization
9.1 Spreadsheet for First Experiment, Without Uncertainty
9.2 Spreadsheet for First Experiment, With Uncertainty
9.3 Spreadsheet for First Experiment, Analytical Approach Using Traditional Methods
9.4 Monte-Carlo Spreadsheet for First Experiment
9.5 Construction Cost Graph
9.6 Number of Formulae in Construction Experiments
9.7 Formula and Layout Changes During Retrospection Experiments
9.8 Background of Respondents
9.9 Average Completion Time
9.10 Questions and Average Responses
9.11 Hierarchical Business Plan Spreadsheet
9.12 Market Share Overview with Embedded Sheets
9.13 Market Share for Year 1
9.14 Target Market
9.15 Typical Break-even Visualization
9.16 Probabilistic Break-even Visualization
List of Tables
2.1 Sources and Causes of Information Uncertainty
2.2 Data Stages in the Data State Model
2.3 Transformation Operators in the Data State Model
4.1 Format of Cells in the Prototype System
4.2 Prototype Uncertainty Interrogation Functions
5.1 Predicted Growth Rates used in Figure 3.1
5.2 Categories of Information Uncertainty Modeling Techniques
5.3 Common Information Uncertainty Modeling Types
6.1 Information Uncertainty Visualization Objectives
6.2 Questions Used to Elicit the User-Objective
8.1 Comparison of IntervalCell and SpreadsheetCell
8.2 Examples from the Prototype Addressing Scheme
8.3 Novel Cell Types for GPGPU
8.4 Novel Functions for GPGPU
9.1 A Selection of Normal Probability Functions in Microsoft Excel 2003
9.2 Actions of the User
9.3 Retrospection Cost
9.4 Respondents to the Survey
Statement of Original Authorship
The work contained in this thesis has not been previously submitted for a
degree or diploma at any other higher education institution. To the best of
my knowledge and belief, the thesis contains no material previously pub-
lished or written by another person except where due reference is made.
Signature:
Alexander Streit
Date:
CHAPTER 1
Introduction
1.1 Motivation
The term information uncertainty refers to vagueness, imprecision, fuzziness, likeli-
hood, and related uncertainty as it is present in information. Many problems are subject
to information uncertainty and, in response, numerous techniques have been developed
to model this uncertainty. Modeling information uncertainty not only provides greater
confidence in results, but can also give an indication of how much confidence to place
in the results. While visualization is a popular tool, information uncertainty visualiza-
tion is far less widespread.
In this work we have identified four major barriers to the uptake of information
uncertainty modeling and visualization. Firstly, there are numerous information
uncertainty modeling techniques, each of which is treated differently. This forces users to
anticipate their information uncertainty needs before building their data model. Sec-
ondly, parameters of the uncertainty space tend to be treated at the same level as vari-
ables, which makes it easier to introduce avoidable errors and causes the information
uncertainty modeling technique to dictate the structure of the user’s model. Thirdly,
propagation of uncertainty information must be manually managed by the user, which
requires expertise, is error prone, and can be tedious. Fourthly, uncertainty visualiza-
tion techniques tend to be developed for particular information uncertainty types and
they are largely incompatible with other forms of uncertainty information. This nar-
rows the selection of visualization techniques available and results in a tendency for ad
hoc information uncertainty visualization techniques.
Information uncertainty modeling makes it more difficult to manage the data model
due to increased information. Furthermore, it is common that a chosen uncertainty
modeling technique will subsequently need to be changed, since knowledge about the
uncertainty changes as more information becomes available. This is currently a diffi-
cult and error prone process.
Visualization of information uncertainty poses its own unique challenges. Existing
visualization techniques may not be appropriate for uncertainty information and there
are issues with information overloading and interpretability of results. On a practical
level, there is a lack of tools that are conducive to visualizing information uncertainty.
Easing the burden of managing the information overload in modeling and visualization
requires an integrated system that covers the entire workflow cycle from data
acquisition to visualization. Tools are also needed to help users with higher-level tasks
such as selection of modeling and propagation options, and organization and compar-
ison of visual mappings. More specifically, the architecture should support automated
uncertainty propagation and allow easy switching between different uncertainty mod-
els, and different methods of display.
Spreadsheets are often used to perform uncertainty based analysis and they have
previously been shown to be well suited as an approach to visualization. However, the
benefit of a spreadsheet approach to uncertainty modeling and visualization has not yet
been explored. This thesis extends the spreadsheet paradigm to support information
uncertainty modeling and visualization in an integrated whole.
1.2 Aims
The overall aim for this thesis is to devise an integrated information uncertainty mod-
eling and visualization environment that has the following features:
Hierarchical structure: The system should differentiate between levels of detail in
the data model. Uncertainty information is of a lower level of detail than the
variables.
Reduce data-type lock-in: If a data model is constructed using particular informa-
tion uncertainty modeling techniques, the cost to change to another modeling
technique should be minimized.
Adaptive: Information about the uncertainty space of a variable should be easy to add,
change, or remove at any stage of the modeling and visualization process.
Seamless integration of information and its uncertainty: There should not be an ar-
tificial separation between the information and its uncertainty.
Simplify information uncertainty modeling: Users should not be required to have
an intimate understanding of the modeling technique mechanics in order to use
it.
Automate propagation: Uncertainty information needs to be propagated and the sys-
tem should carry this out automatically.
Less error prone: The system should reduce the potential for user induced errors.
Flexible: Users should be able to map uncertainty information into alternative models
and visual features so that they can explore the impacts of different modeling
and visualization techniques.
Robust: When the uncertainty information changes, the existing data model and vi-
sualizations should continue to function correctly.
Extensible: There are numerous information uncertainty modeling techniques and the
design of the system should allow for more to be added.
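As an illustration of the encapsulation and automated propagation features listed above, the following is a minimal sketch and not the prototype's implementation. It assumes Python and uses interval arithmetic as the modeling technique; the `Interval` class, its `lift` helper, and the `price`, `quantity`, and `revenue` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical cell value: a variable bundled with its lower and upper bounds."""
    low: float
    high: float

    @property
    def estimate(self) -> float:
        # Midpoint shown when the uncertainty detail is hidden from the user.
        return (self.low + self.high) / 2

    @staticmethod
    def lift(value):
        # Promote a plain number to a zero-width interval so mixed formulas work.
        if isinstance(value, Interval):
            return value
        return Interval(float(value), float(value))

    def __add__(self, other):
        other = Interval.lift(other)
        return Interval(self.low + other.low, self.high + other.high)

    __radd__ = __add__

    def __mul__(self, other):
        other = Interval.lift(other)
        # The bounds of a product are the extremes of the four corner products.
        products = [self.low * other.low, self.low * other.high,
                    self.high * other.low, self.high * other.high]
        return Interval(min(products), max(products))

    __rmul__ = __mul__


# The formula mentions only variables; the bounds are propagated automatically.
price = Interval(4.5, 5.5)
quantity = 10
revenue = price * quantity + 2
```

Here `revenue` evaluates to the interval [47, 57] with an estimate of 52, although the formula itself never refers to the bounds. Replacing `price` with a plain number would leave the formula unchanged, which is the kind of data-type independence the features above call for.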
In order to achieve this aim, the following tasks are performed:
• Examine the field to determine the current state of play, covering information
uncertainty modeling techniques, visualization processes and practices, and un-
certainty visualization;
• Design an integrated information uncertainty modeling and visualization framework;
• Investigate how the spreadsheet paradigm can be extended to intrinsically sup-
port information uncertainty modeling and visualization;
• Explore uncertainty encapsulation as an approach to semantic association of in-
formation and its uncertainty;
• Develop an automated propagation mechanism and a method for resolving un-
usual modeling technique combinations;
• Design uncertainty abstractions that enable visualization mappings to be data-
type independent;
• Explore the user-objectives approach as a means for defining visualization char-
acteristics;
• Conduct a quantitative analysis comparing the cost of our approach to existing
methods;
• Analyze feedback from potential users;
• Conduct case studies on financial decision support and business planning to es-
tablish the viability of the spreadsheet for commercial uses;
• Investigate the capability of the architecture to be applied to non-uncertainty
uses through a case study; and
• Draw conclusions and make recommendations for future work.
1.3 Scope
This thesis deals with information uncertainty, which is uncertainty about the true value
of a unit of information. The intrinsic connection between uncertainty and information
is the basis for our encapsulation approach, which underpins the automatic propagation
and visualization-oriented uncertainty abstraction. However, there exist several other
forms of uncertainty, such as uncertainty arising from interpretation, for which the
encapsulation approach may not be suitable. The methods presented in this thesis are
limited to those forms of uncertainty that can be parametrized in some quantifiable
way.
Modeling of uncertainty has its foundation in mathematics. This project is con-
cerned with the frameworks, approaches, and methods for applying these modeling
techniques. As such, mathematical issues will be touched on; however, detailed coverage
of mathematical models is beyond the scope of this work and it is assumed that
users will use the mathematical techniques appropriate to their problem.
1.4 Original Contribution
Current investigations into information uncertainty visualization have focused on vi-
sualization techniques for particular information uncertainty data types. We approach
the problem of information uncertainty visualization holistically, from modeling and
automated propagation through to user-objectives in visualization.
We produce an integrated information uncertainty modeling and visualization framework
and design the information uncertainty visualization spreadsheet, which intrinsically
supports information uncertainty modeling, automated uncertainty propagation,
and uncertainty model abstracted visualization.
To achieve this we extend the spreadsheet paradigm to incorporate information un-
certainty and visualization features. This requires a number of components. Firstly,
our encapsulation of uncertainty information approach semantically links the infor-
mation to its uncertainty. Secondly, we introduce the uncertainty propagation model
to manage the mechanics of propagating uncertainty, including operations involving
mixed data-type parameters. Thirdly, we present hierarchical heterogeneous propaga-
tion, which automatically determines suitable combinations of the available methods
to ensure that the propagation can be achieved. Fourthly, we produce uncertainty ab-
straction models, which abstract the uncertainty information for visual mapping in vi-
sualizations by providing a common plural value type. Fifthly, we incorporate flexible
visualization capabilities into the spreadsheet using a visualization sheet.
Abstraction from the information uncertainty data type means that traditional data-
type specific visual mapping criteria may no longer be applicable, leaving a gap in the
knowledge. To address this, we investigate user-objectives for information uncertainty
visualization, which describe the characteristics of uncertainty space that the user is
seeking to visualize. User-objectives provide a data-type abstracted means of describ-
ing, executing, and evaluating visualizations.
1.5 Significance
The significance of this work is that it provides an intuitive and non-intrusive
environment for modeling and visualizing information uncertainty. This has
three major effects. Firstly, access to information uncertainty visualization is designed
into the system from the outset and it does not require user expertise in uncertainty
techniques to manage information uncertainty. Secondly, uncertainty information is
easily added, changed, or removed at any stage of the process. Thirdly, information
uncertainty visualizations can be built independently of the modeling technique, pro-
viding a coherent foundation for the development of visualization techniques while
reducing their tendency to be ad hoc.
Information uncertainty is a problem in many fields. Overcoming barriers to its
modeling and visualization is an important step in managing a difficult problem.
1.6 Organization of the Thesis
The organization of this thesis is as follows. Chapter 2 introduces background mate-
rial on uncertainty modeling techniques, visualization techniques, and what has been
done to visualize uncertainty. Chapter 3 describes the framework that integrates infor-
mation uncertainty modeling and visualization tasks together into a coherent whole.
Chapters 4 through 6 cover the components of the framework: Chapter 4 elaborates
on the spreadsheet paradigm as a mechanism for integrating and managing these tasks.
Chapter 5 investigates the encapsulation approach to information uncertainty, which
includes the unified hierarchy and automated propagation. Chapter 6 explores the ab-
straction approach, which includes uncertainty abstraction models and user-objectives
for visualization. Chapter 7 integrates the components into a core system, covering
the requirements, design, and architecture. Chapter 8 considers advanced features and
extensibility of the system. Chapter 9 presents the evaluations of the system, with a
comparative analysis of different approaches, a discussion of a survey, and a case study
in business planning. Chapter 10 provides a conclusion and points to future work.
CHAPTER 2
Background
“As far as the laws of mathematics refer to reality, they are not certain;
and as far as they are certain, they do not refer to reality.”
– Albert Einstein1
2.1 Introduction
Information uncertainty is a complex subject that is inherent in many real-world prob-
lems. The uncertainty comes from different sources and can be interpreted and mod-
eled in various ways. There are often subtle interactions between variables and uncer-
tainty, which can be difficult to understand. Visualization of information uncertainty
presents an opportunity to provide deeper insights into the nature of the information,
its uncertainty, and the impact it has on outcomes. However, the difficulty of adopting
information uncertainty and the lack of visualization tool support have caused many
practitioners to ignore the uncertainty completely or to ignore situations where the
1In J. R. Newman (ed.) The World of Mathematics, New York: Simon and Schuster, 1956
uncertainty is deemed too high. This practice results in valuable knowledge being dis-
carded and reduced quality in outcomes, or worse, can even result in entirely wrong
outcomes.
There are aspects of information uncertainty that have been given considerable at-
tention, particularly in the field of mathematics. Two aspects have been especially
well developed: the first includes the various mathematical models that exist for repre-
senting, measuring, and recording uncertainty. The second aspect is the collection of
rules and techniques for propagating, estimating, and minimizing information uncer-
tainty. These models and techniques range from the statistical methods and probabil-
ities through to fuzzy models. Research into visualization of information uncertainty
has only been carried out sporadically during the last decade. Earlier work has focused
on a data-driven approach, with visual data representations for particular data types or
responding to the needs of specific applications. More recent work has investigated
task-based approaches and sought to integrate higher-level issues, such as software
architectures and frameworks for visualization systems.
The aim of this chapter is to provide background for understanding information un-
certainty modeling and visualization by examining relevant works and identifying key
issues. The chapter is organized as follows. Section 2.2 describes information uncer-
tainty in general, covering sources of information uncertainty, understanding informa-
tion uncertainty and its usage, and information uncertainty modeling techniques. Sec-
tion 2.3 discusses relevant issues in visualization, focusing on the process and sense-
making cycle, and visualization techniques. Section 2.4 examines current progress and
key techniques in information uncertainty visualization. A summary of this chapter is
given in Section 2.5.
2.2 Information Uncertainty
In many circumstances the true value of a variable is not fully known, giving rise
to information uncertainty. The information that is known about the variable can be
stored; this practice is referred to as information uncertainty modeling. As an
example, information uncertainty modeling can be used to aid analysis of the potential
environmental impact of a new road. Data is required about the type, amount, and
distribution of vegetation; the variety, location, and habits of local animals; and how
all of these interact. The data that is collected will only be accurate to a certain level
of precision, which can be modeled. Further, much of the information derived from
expert knowledge will be qualitative in nature and thus dependent on interpretation.
It is already a significant task to understand the structure, characteristics, trends,
and interdependency of data. However, information uncertainty serves to complicate
things even further as it requires an understanding of the propagation of uncertainty,
the potential for variation in outcomes, and impacts due to changes in the level of
information uncertainty. Effective visualization of the information and its uncertainty
can help to overcome this problem.
Historically, uncertainty had been regarded as an undesirable factor that is to be
avoided. Only in the 20th century has it become a fundamental component of sci-
ence [62]. However, the term uncertainty itself can vary depending on the author and
the field. For example, Hunter and Goodchild, dealing with spatial databases, reserve
the term uncertainty to refer exclusively to unknown inaccuracy and instead use the
term error for objectively known inaccuracy [47]. Pang et al. use the term uncer-
tainty to cover three categories [86]: statistical, including probabilistic and confidence
methods; error, which refers to differences between estimates and actual values; and
range, which covers intervals of possible values. Klir [58, 60] and Gershon [33] offer
a more general definition of uncertainty as some deficiency in information, and from
there define a measure of information in terms of reduction in uncertainty. Standards
and guidelines have been developed for the management of uncertainty in measure-
ment. One such guide by the National Institute of Standards and Technology (NIST)
describes measurements as approximations and contends that “the result is complete
only when accompanied by a quantitative statement of its uncertainty” [110, p. 1]. A
similar guide that was issued for analytical chemistry by Eurachem defines measure-
ment uncertainty as a parameter “that characterizes the dispersion of the values that
could reasonably be attributed to the measurand” [29, p. 4]. The common theme is
that uncertainty can be characterized for a particular unit of information, and we use
the term information uncertainty to refer to situations where this condition holds.
2.2.1 Sources of Information Uncertainty
Pang et al. [86] investigated uncertainty visualization and categorized sources of in-
formation uncertainty based on the point of the visualization process in which it is
introduced. The resulting three categories are acquisition, where information uncer-
tainty is introduced from the measurements and models; transformation, introduced
during the information processing step for visualization; and visualization, referring to
the uncertainty introduced through the act of the visualization itself. These categories
are helpful in characterizing the introduced uncertainty for visualization, but lack gran-
ularity in describing the reason for the uncertainty. Thomson et al. [112] focused on
the tasks of information analysts in the field and used their descriptive terms to derive
a categorization for uncertainty in geospatially referenced information.
Information uncertainty can arise for a number of reasons. Whenever predictions
are made, they are uncertain. Errors and imprecision in measurement are another
common source. The Eurachem guide lists eleven sources of measurement uncertainty
[29], but is careful to point out that these may not necessarily be independent.
While their list includes “operator effects” to cover human introduced uncertainty, the
sources are mostly concerned with acts of measurement. Pham and Brown [90] pro-
vide a categorization of uncertainty into three categories: factual, pseudo-measurement
and pseudo-numerical, and perceptual-based. Factual information is numerical and
measurement-based. Pseudo-measurement and pseudo-numerical information are nu-
meric approximations. Perceptual-based information is typically linguistic, but can
also be image- or sound-based. Table 2.1 lists typical sources of information un-
certainty and examples of causes (from [91]). Earlier work by Reznik and Pham
matched nine similar categories of uncertainty sources to uncertainty modeling tech-
niques [100].
Sources of information uncertainty | Causes
Limited accuracy | Limitation in measuring instruments, or computational processes, or standards.
Missing data | Physical limitation of experiments; limited sample size or non-representative sample.
Incomplete definition | Impossibility or difficulty in articulating exact functional relationships or rules.
Inconsistency | Conflicts arisen from multiple sources or models.
Imperfect realisation of a definition | Physical or conceptual limitation.
Inadequate knowledge about the effects of the change in environment | Model does not cover all influence factors; or is made under slightly different conditions; or is based on the views of different experts.
Personal bias | Differences in individual perception.
Ambiguity in linguistic descriptions | A word may have many meanings; or a state may be described by many words.
Approximation or assumptions embedded in model design methods or procedures | Requirements or limitations of models or methods.
Table 2.1: Sources and Causes of Information Uncertainty
2.2.2 Understanding Information Uncertainty
The search for truth is a goal of science and the presence of uncertainty can imply
a deficiency in our understanding. This explains why throughout most of recorded
history scientific thought has sought to avoid uncertainty2. However, attitudes toward
uncertainty have begun to shift, partly due to discoveries such as the Heisenberg un-
certainty principle. Today, uncertainty is viewed as an intrinsic property of problems
in most fields. For example, Couclelis noted that considerable effort had been devoted
to fighting uncertainty in Geographic Information Systems (GIS), but that there are
2The interested reader is directed to Appendix A of [2] for history of perspectives on knowledge.
many things that cannot be known, and the inability to know was not due to human
limitation [19].
Many disciplines use information uncertainty modeling techniques to manage un-
certainty. The incorporation of information uncertainty techniques enables practition-
ers to describe and quantify the uncertainty space. Uncertainty information at the
inputs can then be propagated through the model to the outputs. The output can then
provide additional information, such as how much confidence we should place in
the result, what alternatives the result may have, and other details depending on the modeling
technique and the inputs.
A recent use for information uncertainty techniques is to simplify systems by re-
moving less important information. This mirrors human reasoning, where we reserve
detail for items of interest. For example, when ascertaining whether to jump out of the
way of a moving vehicle, a rough estimate of the vehicle’s velocity is usually sufficient
to determine the appropriate action [77] and precise knowledge of the actual velocity
is usually not necessary.
There are two main approaches to information uncertainty modeling and propaga-
tion. The first approach is to use analytical techniques, which require an understanding
of mathematical principles involved. The second approach is to use numerical tech-
niques, such as Monte-Carlo simulation. Sometimes numerical techniques are used
implicitly without the user realizing, usually by manually varying the inputs and ob-
serving their effects.
Uncertainty is so intrinsic to information that Klir [62, 63, 58, 61, 59, 60] has been
working on a generalized information theory, which aims to incorporate
uncertainty and information into a unified theory. Their approach is to conceive of
uncertainty-based information as being the result of a reduction in uncertainty. As a
result of some action, the a priori uncertainty U1 becomes the a posteriori uncertainty
U2 and the information derived from this action is therefore given by U1 −U2 [59].
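The relation U1 − U2 can be illustrated with a small worked sketch. The source does not name a particular uncertainty measure here, so the code below assumes, purely for illustration, the Hartley measure (the base-2 logarithm of the number of equally possible alternatives). The function names are hypothetical.

```python
import math


def hartley_uncertainty(num_alternatives: int) -> float:
    """Uncertainty in bits when choosing among equally possible alternatives.

    Assumption: the Hartley measure is used only as an example measure.
    """
    if num_alternatives < 1:
        raise ValueError("at least one alternative is required")
    return math.log2(num_alternatives)


def information_gained(prior_alternatives: int, posterior_alternatives: int) -> float:
    """Information derived from an action, computed as U1 - U2."""
    u1 = hartley_uncertainty(prior_alternatives)      # a priori uncertainty
    u2 = hartley_uncertainty(posterior_alternatives)  # a posteriori uncertainty
    return u1 - u2
```

Under this assumed measure, an action that narrows eight equally possible alternatives down to two reduces the uncertainty from 3 bits to 1 bit, so the information derived from the action is 2 bits.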
Using information uncertainty modeling techniques not only provides greater con-
fidence in results, but can also give an indication of how much confidence to place in
the result.
2.2.3 Approaches to Modeling Information Uncertainty
There are numerous uncertainty modeling techniques and we will describe several of
the major ones here. Since the sources and causes of uncertainty are different, various
mathematical models have been developed to faithfully represent different types of
information. We summarize these models into four common types:
Probability denotes the likelihood of an event to occur or a positive match. Prob-
ability theory provides the foundation for statistical inference (e.g. Bayesian
methods) [1, 5].
Possibility provides alternative matches, e.g. a range of errors in measurement [2].
Provability is a measure of the ability to express a situation where the probability of
a positive match is exactly one. Provability is the central theme of techniques
such as Dempster-Shafer calculus.
Membership denotes the degree of match and allows for partially positive matches,
e.g. fuzzy sets [3, 6], rough sets [4].
Probability theory models uncertainty in terms of anticipation: the expectation that an
outcome will eventuate is characterized by a probability.
Classical probability theory describes the ratio between favorable and indifferent
outcomes, which has several shortcomings. This led to the development of frequentist
probability theory, which defines the chance of a given result under random conditions.
Thus, in a repeated experiment the probability of an event will tend toward the ratio
between the number of times it occurs to the number of times the experiment was run:
\[ \Pr(x) = \frac{x}{x + \bar{x}} \]
where $\Pr : X \to [0,1]$ is the probability function, $x \in X$ is the event, and $\bar{x} = X - \{x\}$
is all other outcomes. A probability distribution completely describes the expected
outcomes of a random variable. For real-valued random variables, the probability dis-
tribution can be defined by
\[ F(x) = \sum_{x_i \le x} \Pr(x_i) \]
for discrete probabilities and
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt \]
for continuous probabilities, where f is the probability density function. A probability
density function (PDF) is effectively a histogram of expected outcomes, with a
scale such that the integral is unity, $\int_{-\infty}^{\infty} f(t)\,dt = 1$.
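As a minimal sketch of the frequentist ratio and the discrete distribution function described above, the following code estimates a probability from repeated trials and sums a probability mass function up to a given value. The fair six-sided die, the seed, and all function names are assumptions chosen for illustration.

```python
import random
from collections import Counter


def relative_frequency(outcomes, event):
    """Number of times `event` occurred divided by the number of trials."""
    counts = Counter(outcomes)
    return counts[event] / len(outcomes)


def discrete_cdf(pmf, x):
    """F(x): sum of Pr(x_i) over all x_i <= x, for a dict of value -> probability."""
    return sum(p for value, p in pmf.items() if value <= x)


# Hypothetical example: a fair six-sided die.
die_pmf = {face: 1 / 6 for face in range(1, 7)}

# Repeat the experiment with a fixed seed so the run is reproducible.
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(60_000)]

estimate_for_four = relative_frequency(rolls, 4)  # tends toward 1/6
chance_of_three_or_less = discrete_cdf(die_pmf, 3)
```

With more trials the relative frequency settles closer to the underlying probability, which is the behaviour the frequentist definition relies on.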
Probability distributions can take any form, however several well studied distri-
butions exist. Of these, the two most commonly used distributions are the uniform
distribution and the normal distribution. The uniform distribution assigns every out-
come an equal probability. The normal distribution3 has a PDF of
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
where μ is the mean and σ is the standard deviation. Normal distributions find
common use because the sum of many independent random variables will approximate
a normal distribution.
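The normal density can be written directly from its definition. The sketch below is illustrative only: `normal_pdf` and `integrate` are hypothetical helper names, and the midpoint rule is an assumed, simple way to check numerically that the density integrates to approximately one.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))


# Integrating the standard normal density over eight standard deviations
# either side of the mean should give a value very close to one.
area = integrate(normal_pdf, -8.0, 8.0)
```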
Monte-Carlo simulation is a numerical approach to uncertainty that uses probabil-
ity distributions [76, 3]. Input variables are assigned a probability distribution, com-
monly a uniform or normal distribution. Numerous random instances are chosen for
the inputs according to these distributions, and the outputs calculated from them can then
be used to characterize the uncertainty in the outputs of the system.
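The procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the model (`profit`), the input names, their distributions, the trial count, and the seed are all hypothetical.

```python
import random
import statistics


def monte_carlo(model, input_samplers, trials=20_000, seed=0):
    """Draw random inputs from their distributions and collect the model outputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(trials):
        inputs = {name: sampler(rng) for name, sampler in input_samplers.items()}
        outputs.append(model(**inputs))
    return outputs


def profit(price, units):
    """Hypothetical model whose inputs are uncertain."""
    return price * units


# Each uncertain input is assigned a distribution (assumed values).
samplers = {
    "price": lambda rng: rng.uniform(9.0, 11.0),  # uniform distribution
    "units": lambda rng: rng.gauss(100.0, 5.0),   # normal distribution
}

outputs = monte_carlo(profit, samplers)
mean_output = statistics.fmean(outputs)
spread_output = statistics.stdev(outputs)
```

The mean and standard deviation are only two possible summaries; the collected outputs could equally be binned into a histogram to approximate the output distribution.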
The frequentist view of probability models an expectation based purely on frequency
of events. An alternative view is Bayesian probability theory, which has found
widespread use in fields such as machine learning and computer vision [89], and
3also known as the Gaussian distribution after Gauss
econometrics [35]. The mathematician Thomas Bayes introduced a theorem that was
generalized by Laplace but is still referred to as Bayes’ Theorem. The theorem relates
the conditional and marginal probability of events of two random variables, x and y:
\[ \Pr(x \mid y) = \frac{\Pr(y \mid x)\,\Pr(x)}{\Pr(y)} \]
Bayes’ theorem enabled a new philosophical view of probabilities as modeling
belief, using what is called Bayesian inference. Thus, our expectation of an event can
be revised: Pr(x|y) is the posterior probability, our revised expectation of event x given
evidence y; Pr(y|x) is the conditional probability of seeing y given the hypothesis that
x is true; Pr(x) is the prior probability of x; and Pr(y) is the marginal probability of y,
whether or not x is true.
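A short sketch of this revision of belief is given below. It assumes a binary hypothesis, so that the marginal Pr(y) can be expanded with the law of total probability; the function name and the numbers in the example are hypothetical.

```python
def posterior(prior_x, likelihood_y_given_x, likelihood_y_given_not_x):
    """Revised belief Pr(x|y) for a binary hypothesis x, given evidence y."""
    # Marginal probability of y, whether or not x is true.
    marginal_y = (likelihood_y_given_x * prior_x
                  + likelihood_y_given_not_x * (1.0 - prior_x))
    if marginal_y == 0.0:
        raise ValueError("evidence y has zero probability under this model")
    return likelihood_y_given_x * prior_x / marginal_y


# Hypothetical numbers: a condition with a 1% prior, evidence that appears
# in 90% of true cases and in 5% of false cases.
revised = posterior(prior_x=0.01,
                    likelihood_y_given_x=0.9,
                    likelihood_y_given_not_x=0.05)
```

In this assumed example the evidence raises the expectation of x from 1% to roughly 15%, which shows how a low prior tempers even fairly strong evidence.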
Probabilities, whether frequentist or Bayesian, are not the only means of describing
uncertainty. Classical sets are a means of defining uncertainty. For example, the
diagnosis made by a medical doctor could be that a patient suffers from the flu or the
cold. In this situation there is a possibility that either (or both) be true and there is un-
certainty about which it is. Lotfi Zadeh, in his seminal 1965 paper, proposed fuzzy sets,
which along with fuzzy logic enable human-like reasoning using partial truth4. A fuzzy
set is a set where each member can be assigned a truth value. This can be expressed as a
fuzzy membership function μ : X → [0,1], where 0 indicates not a member, 1 indicates
definitely a member, and all numbers in between indicate a partial membership.
A good example of fuzzy sets is given by Mendel in [77]. College students were
asked to rank words such as “somewhat” and “quite a bit” against a numerical scale of
quantity. Although individual answers varied significantly, a clear ordering emerged
and Mendel was able to produce a mapping between these words and their indication of
quantity. From there fuzzy sets can be constructed, such as the set lots, which capture
the degree to which numeric values can be representations for each of the words.
4Zadeh was not the first to investigate partial truth and interested readers are directed to the works of
Łukasiewicz [71]
For example, assuming that 24◦C is normal room temperature and 30◦C is com-
pletely hot, temperatures between 24◦C and 30◦C are partially hot. The set of hot
temperatures might therefore be given by the following membership function, illus-
trated graphically in Figure 2.1:
\[ \mu_{Hot}(x) = \begin{cases} 1 & x > 30 \\ \frac{x-24}{30-24} & 24 \le x \le 30 \\ 0 & x < 24 \end{cases} \]
Figure 2.1: Fuzzy Set for Hot (membership from 0 to 1 plotted against temperature from 20 to 32)
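The membership function for the set of hot temperatures translates directly into code. The sketch below uses the thresholds given in the text (24◦C and 30◦C); the function name `mu_hot` is an illustrative choice.

```python
def mu_hot(temp_c: float) -> float:
    """Degree of membership of a temperature (in degrees C) in the fuzzy set Hot."""
    if temp_c > 30.0:
        return 1.0  # completely hot
    if temp_c < 24.0:
        return 0.0  # not hot at all
    # Partial membership rises linearly between 24 and 30 degrees.
    return (temp_c - 24.0) / (30.0 - 24.0)
```

For instance, a temperature of 27◦C lies halfway between the two thresholds and therefore has a membership of 0.5.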
Fuzzy logic is the inference counterpart for fuzzy sets. A complete fuzzy logic sys-
tem consists of a fuzzifier, inference engine, and defuzzifier [77]. The fuzzifier maps
numeric values to partial memberships of fuzzy sets. The inference engine executes
operations and rules on fuzzified information. The defuzzifier converts a fuzzy repre-
sentation into a bi-valued representation, typically using an α-cut.
The graph in Figure 2.2 shows an example fuzzifier that maps room temperatures to
the fuzzy sets Cold, Normal, and Hot. A temperature of 25.5◦C will be wholly outside
the set Cold, mostly within the set Normal, and to a lesser extent partially within the
set Hot.
Methods are defined for several fuzzy set operations, including AND (intersection),
OR (union), and NOT (complement)5. Traditional fuzzy logic, also called Zadeh fuzzy
logic after its inventor, is shown graphically in Figure 2.3. Given two fuzzy variables,
5Implementations of the operators can vary depending on the application; the interested reader is directed
to [77]
Figure 2.2: Fuzzification for Temperature (sets Cold, Normal, and Hot; membership from 0 to 1 plotted against temperature from 16 to 32)
a and b:
\[ a \cup b = a \text{ OR } b = \max(\mu(a), \mu(b)) \]
\[ a \cap b = a \text{ AND } b = \min(\mu(a), \mu(b)) \]
\[ \neg a = \text{NOT } a = 1 - \mu(a) \]
Figure 2.3: Results of Fuzzy Operations are Shown by the Grey Shaded Regions (panels: Set A, Set B, A AND B, A OR B, NOT A; each with membership from 0 to 1)
Thus rules can be established using constructs such as D = A ∧ B ∨ ¬C, where A,
B, C, and D are fuzzy sets.
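A rule of this form can be evaluated on membership degrees with a few lines of code. The sketch below uses the standard Zadeh operators, in which union (OR) takes the maximum and intersection (AND) takes the minimum of the two memberships. It assumes that AND binds more tightly than OR, so the rule is read as D = (A ∧ B) ∨ ¬C; the function names are hypothetical.

```python
def fuzzy_or(a: float, b: float) -> float:
    """Zadeh union: the larger of the two membership degrees."""
    return max(a, b)


def fuzzy_and(a: float, b: float) -> float:
    """Zadeh intersection: the smaller of the two membership degrees."""
    return min(a, b)


def fuzzy_not(a: float) -> float:
    """Zadeh complement."""
    return 1.0 - a


def rule_d(a: float, b: float, c: float) -> float:
    """Membership in D for the rule D = (A AND B) OR (NOT C)."""
    return fuzzy_or(fuzzy_and(a, b), fuzzy_not(c))
```

With memberships of 0.25 in A, 0.75 in B, and 0.5 in C, the conjunction yields 0.25, the complement yields 0.5, and the resulting membership in D is 0.5.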
The defuzzifier typically uses an α-cut, which is a mechanism to translate the fuzzy
output into traditional bi-valued truth, most typically:
\[ \mu'(d) = \begin{cases} 1 & \mu(d) \ge \alpha \\ 0 & \text{otherwise} \end{cases} \]
where $\mu(d) \in [0,1]$ is the fuzzy truth value and α, called the “α-cut plane”, is in the range [0,1]. An example
of defuzzification with an α-cut of 0.5 is given graphically in Figure 2.4. As the graph
shows, the set of normal temperatures is mapped to the interval [21.5,26.5].
Figure 2.4: Defuzzification Using an α-cut (set Normal; membership from 0 to 1 plotted against temperature in ºC from 16 to 32; alpha cut = 0.5)
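The α-cut itself is a simple threshold test. The sketch below applies it to the membership function for Hot given earlier in this section, because the exact shape of the Normal set in Figure 2.4 is not stated in the text; the function names and the sample temperatures are assumptions.

```python
def mu_hot(temp_c: float) -> float:
    """Membership in Hot: 0 below 24 degrees C, 1 above 30, linear in between."""
    if temp_c > 30.0:
        return 1.0
    if temp_c < 24.0:
        return 0.0
    return (temp_c - 24.0) / (30.0 - 24.0)


def alpha_cut(membership: float, alpha: float = 0.5) -> int:
    """Translate a fuzzy truth value into bi-valued truth."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return 1 if membership >= alpha else 0


def crisp_members(universe, mu, alpha: float = 0.5):
    """Elements of `universe` whose membership survives the alpha-cut."""
    return [x for x in universe if alpha_cut(mu(x), alpha) == 1]


# Hypothetical sample of temperatures in degrees C.
hot_at_half = crisp_members([24, 26, 27, 29, 31], mu_hot, alpha=0.5)
```

With an α-cut of 0.5, temperatures of 27◦C and above are treated as hot and everything below is not, so the fuzzy set collapses to a classical interval.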
A popular representation for uncertainty is the rough set [95]. Rough sets extend
classical sets to allow an element to be both inside and outside the set. Thus there are
three modes: inside, outside, and both. Three operations are defined that translate to
a classical set: the upper limit, the lower limit, and the boundary. The upper limit in-
cludes all items that are wholly inside, or both inside and out. The lower limit includes
only those items that are inside the set. The boundary includes only items that are both
inside and outside the set. This is illustrated graphically in Figure 2.5. An example
application of rough sets is in classifying customer details: the rough set information
provided will contain a customer if all information has been provided, not contain a
customer if no information is provided, and be in both states if some information is
provided but some is missing. The company can send letters requesting information to
¬LOWER(c) where c is the “information provided” rough set.
Figure 2.5: Example Rough Set for Containment of a Region (regions labeled Boundary and Lower)
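The customer example can be sketched with a three-valued mode per element. This is an illustrative reading of the description above rather than a full rough set implementation: the required fields, the customer records, and all names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    INSIDE = "inside"
    OUTSIDE = "outside"
    BOTH = "both"


REQUIRED_FIELDS = ("name", "address", "phone")  # hypothetical fields


def classify(record, required=REQUIRED_FIELDS):
    """Place a customer record relative to the 'information provided' rough set."""
    provided = sum(1 for field in required if record.get(field))
    if provided == len(required):
        return Mode.INSIDE   # all information provided
    if provided == 0:
        return Mode.OUTSIDE  # no information provided
    return Mode.BOTH         # some provided, some missing


def lower(rough_set):
    """Items that are wholly inside the set."""
    return {key for key, mode in rough_set.items() if mode is Mode.INSIDE}


def upper(rough_set):
    """Items that are wholly inside, or both inside and outside."""
    return {key for key, mode in rough_set.items() if mode is not Mode.OUTSIDE}


def boundary(rough_set):
    """Items that are both inside and outside the set."""
    return upper(rough_set) - lower(rough_set)


customers = {
    "ann": {"name": "Ann", "address": "1 High St", "phone": "555-0100"},
    "bob": {"name": "Bob", "address": "", "phone": None},
    "cy": {},
}
info_provided = {cid: classify(rec) for cid, rec in customers.items()}

# Customers to write to: everyone outside the lower limit.
needs_letter = set(customers) - lower(info_provided)
```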
Another common classical set-based uncertainty modeling technique is the inter-
val. Intervals define the upper and lower boundaries on a continuum, most commonly
R. The boundaries themselves can be inclusive, which is indicated using square brack-
ets; or exclusive, indicated using rounded brackets. Thus, [0,1) includes zero but ex-
cludes one. Interval arithmetic defines the propagation of uncertainty under common
arithmetic operators. For example, addition is defined as:
[a,b]+[c,d] = [a+c,b+d]
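Interval addition can be sketched with a small value type. The class below is a hypothetical illustration restricted to closed intervals; subtraction is included using the standard interval arithmetic rule [a,b]−[c,d] = [a−d,b−c], which is not stated in the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [lo, hi] on the real line."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower bound must not exceed upper bound")

    def __add__(self, other: "Interval") -> "Interval":
        # [a,b] + [c,d] = [a+c, b+d]
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other: "Interval") -> "Interval":
        # [a,b] - [c,d] = [a-d, b-c]
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __contains__(self, value: float) -> bool:
        return self.lo <= value <= self.hi


total = Interval(1.0, 2.0) + Interval(10.0, 20.0)
```

Because the bounds of the result are derived from the bounds of the operands, the width of the uncertainty grows under both addition and subtraction.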
2.3 Visualization
2.3.1 The Sensemaking Process
The user has a visualization objective. To achieve this objective they will use a visual-
ization technique. The technique transforms information and displays it according to
the parameters of the technique. To reach their objective, users will typically iteratively
adjust the parameters and view the results, repeating as often as necessary. There are
three general classes of user objectives, which can also be described as visualization
phases [56]:
1. Exploration, searching the data for relations and patterns
2. Analysis, exploring known relations
3. Presentation, preparing the visualization to communicate information to others
Visualization requires an iteration of choose-inspect-view-adjust, which can be
cumbersome, particularly for novice visualization users. Several studies have there-
fore considered improving the user experience while going through the visualization
process [21, 49, 92, 11].
One study sought to encode the visualization exploration process using an XML-
based language [49]. The encoding captures the parameters used for each iteration of
the loop. By defining a parameter derivation calculus, the results of several visualiza-
tion sessions can then be visualized. Such visualizations of visualization sessions are
designed to aid the user in understanding the progression of their use of the system.
Although the work seeks to formalize the visualization process, it does not improve the
process and is limited to modulating parameters of a particular visualization technique.
A significant drawback of the work in [49] is that neither a change to another type
of representation nor the selection of alternate data is included in the model.
Visualization is a tool and not an objective in and of itself. However, it has been observed
(e.g. [73]) that some visualizations are good for publications, tending to be
colorful and showy images, but not informative or applicable to real-world problem-
solving. Ma [73] argues that scientists need to be involved in evaluating the effec-
tiveness of visualization methods, and suggests working with users from application
domains both to devise the requirements of the visualization and to subsequently eval-
uate the techniques through case studies.
Another suggestion for overcoming these obstacles is for visualization to be task-driven
instead of data-driven [92, 11]. One approach to this is through an agent-based
framework [92] (see Figure 2.6), where a profile agent observes the user’s choice of
visualizations and adjusts the system’s behavior to improve workflow.
Figure 2.6: The Multi-agent Visualization Support System
Another proposal is the “Visualization Task Network (VTN)” [11] (see Figure 2.7,
from [11, p. 603]), which can learn the requirements of the user. A VTN is a task-oriented
approach, where the user first selects the task to be achieved. For each chosen
task a set of techniques is proposed by the system. Once a technique is chosen, a
list of attributes (e.g. glyph information, grid spacing, and color) is presented. These
parameters are similar to those in the work of Jankun-Kelly et al. [49]. Each time the
user selects a {task,technique,attribute} set for visualization, the system can increase
the weight of that combination. When the user selects a task, the techniques with the
highest weighting are shown first. Similarly, once a task and technique are chosen, the
attributes with the highest weighting are shown first.
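The weighting behaviour described above can be sketched as follows. This is not the VTN of [11]; it is a minimal illustration that assumes integer weights incremented by one per selection and a technique weight equal to the sum over its attributes. The class name, method names, and the example tasks and techniques are hypothetical.

```python
from collections import defaultdict


class TaskNetworkSketch:
    """Ranks techniques and attributes by how often the user has chosen them."""

    def __init__(self, techniques_by_task, attributes_by_technique):
        self.techniques_by_task = techniques_by_task
        self.attributes_by_technique = attributes_by_technique
        # (task, technique, attribute) -> weight
        self.weights = defaultdict(int)

    def record_selection(self, task, technique, attribute):
        """Increase the weight of a chosen {task, technique, attribute} set."""
        self.weights[(task, technique, attribute)] += 1

    def _technique_weight(self, task, technique):
        return sum(weight for (t, tech, _), weight in self.weights.items()
                   if t == task and tech == technique)

    def ranked_techniques(self, task):
        """Techniques for a task, highest weighting first (ties keep their order)."""
        return sorted(self.techniques_by_task[task],
                      key=lambda tech: self._technique_weight(task, tech),
                      reverse=True)

    def ranked_attributes(self, task, technique):
        """Attributes for a task and technique, highest weighting first."""
        return sorted(self.attributes_by_technique[technique],
                      key=lambda attr: self.weights.get((task, technique, attr), 0),
                      reverse=True)


network = TaskNetworkSketch(
    techniques_by_task={"compare": ["bar chart", "scatter plot"]},
    attributes_by_technique={"bar chart": ["color"],
                             "scatter plot": ["color", "glyph"]},
)
network.record_selection("compare", "scatter plot", "glyph")
network.record_selection("compare", "scatter plot", "glyph")
network.record_selection("compare", "bar chart", "color")
```

After these selections the scatter plot is offered before the bar chart for the compare task, and glyph is offered before color for the scatter plot.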
One approach to mapping visual features to visualization techniques takes an objective-
oriented viewpoint, and is derived from the visualization data ontology outlined in [11].
The mapping begins with the choice of data attributes to be represented: relation-
ships, resemblances, order and proportion [34]. These attributes are then mapped to
visual features depending on the visualization task to be performed. The visualization
task is chosen to enhance the perception of the information required by the viewer for
their specific objective in performing the data analysis. The knowledge required for
such a task-oriented approach is encapsulated in an agent-based visualization architec-
ture [11].
A workshop [27] established the visualization ontology represented
in Figure 2.8 (adapted from [27]). Development of the ontology involved investigation
Figure 2.7: The Visualization Task Network (VTN) Learns Task-oriented Visualization
Parameters
of visualization from multiple perspectives. The result produces a clear anatomy of
visualization, except that it is missing one vital part: the role of the user. The user
plays an integral role in the visualization process, driving parameters and the visual-
ization tasks. By excluding the user from the ontology the authors have neglected to
consider not only usability and cultural issues, but also opportunities such as adaptive
visualization systems.
[Figure 2.8 diagram. Concepts: Visualisation, Representation, Data, Task, Transformation, Technique, Visual, Haptic, Isosurface. Properties: is about, uses, supported-by, input to, output from, is realised through. Key: A is-a B; property p between concepts A and B; elided hierarchy.]
Figure 2.8: An Ontology of Visualization
2.3.2 Visualization Techniques
The topic of visualization is traditionally introduced through reference to a taxonomy
of visualization techniques [41, 98, 15, 16]. SIGGRAPH’s visualization education
program [41] introduces visualization techniques through a data-type based classifica-
tion, which is reproduced in Figure 2.9. Examples of selected techniques are given in
Figure 2.10. The limitation of this classification is that it deals only with continuous
ordinal values. Visualizations for other types of data, such as trees, are not included.
Figure 2.9: Visualization Techniques Categorized by the Type of Data to be Visualized
Shneiderman [106] recognized the lack of trees and network graphs and addressed
this by including non-ordinal types in their taxonomy. However, the taxonomy itself
continues to be based on the data-type being visualized. The data-types identified are
[106, pp. 337-339]:
Figure 2.10: Examples of Selected Visualization Techniques. (a) 2D Line-based Contouring (b) 2D Histogram (c) 3D Streamlines
• 1-Dimensional, such as textual documents, program source code, and alphabeti-
cal lists of names.
• 2-Dimensional, such as geographic maps, floor plans, and newspaper layouts.
• 3-Dimensional, real world objects such as molecules, the human body, and build-
ings.
• Temporal, such as medical records, project management, or historical presentations,
which are treated as a data-type that is separate from 1-dimensional data.
• Multi-dimensional, such as records in relational databases.
• Trees, which are hierarchies where each item, except the root item, has a link to
its parent.
• Networks, which represent relations that cannot be captured as trees.
OLIVE [98] is an online catalog of visualization systems categorized according to
this taxonomy, although at the time of writing it is only current up to 1997. While
Shneiderman’s taxonomy covers a wider range of visualizations, not all visualization
systems fit conveniently. For example, visualizations that present temporally ordered
3-Dimensional data could fit into either the temporal- or the 3-Dimensional categories.
To overcome these inconsistencies Card and Mackinlay [15] offer a classification based
on additional factors that need to be considered during visualization. Their analysis of
visualization systems considers not only the type of data, but also the filtering functions
applied to them, the controlled (text) and automatic (glyph) processing techniques, the
viewing transformations, and the user interaction elements for every variable in the
visualization. Data types are classified as [15, pp. 92-93]:
• Nominal, meaning they are only equal or unequal to other values.
• Ordinal, meaning they obey a less-than relation.
• Quantitative, meaning it is possible to do arithmetic on them.
• Intrinsically spatial, which are the subset of quantitative types that represent spa-
tial points.
• Geographical, which are the subset of intrinsically spatial types that represent
geographic locations.
• A Set mapped to itself, which is the case in graphs and trees.
This taxonomy is cumbersome for the purpose of categorizing visualization techniques.
Unlike the preceding taxonomies, there is no single category for a visualization tech-
nique. Instead, each variable used in the visualization is decomposed according to
twelve factors and presented in a matrix. The matrices of two techniques can be com-
pared to pinpoint the exact differences between them. The intent of the authors is not
only to describe the differences in visualization techniques, but also to suggest new
possibilities for visualization techniques [15, pp. 92].
Chi [16] provides a taxonomy to help implementers understand how to implement
visualization techniques. The proposed taxonomy is based on their earlier work on
the Data State Reference Model [18]. A visualization technique is broken down into
four stages according to the state of the data, as shown in Table 2.2, and the data
transformation operators that transform the data from one stage to another, as listed in
Table 2.3.
Stage                       Description
Value                       The raw data.
Analytical Abstraction      Data about data, or information, a.k.a. meta-data.
Visualization Abstraction   Information that is visualizable on the screen using a visualization technique.
View                        The end-product of the visualization mapping, where the user sees and interprets the picture presented to her.
Table 2.2: Data Stages in the Data State Model
Processing Step                  Description
Data Transformation              Generates some form of analytical abstraction from the value (usually by extraction).
Visualization Transformation     Takes an analytical abstraction and further reduces it into some form of visualization abstraction, which is visualizable content.
Visual Mapping Transformation    Takes information that is in a visualizable format and presents a graphical view.
Table 2.3: Transformation Operators in the Data State Model
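To make the relationship between the four stages and the three operators concrete, the following minimal Python sketch passes a small data set through each stage. The choice of a histogram, the function names, and the sample values are assumptions made for illustration; they are not drawn from [16] or [18].

```python
def data_transformation(values, bin_width):
    """Value -> analytical abstraction: count how many values fall in each bin."""
    counts = {}
    for v in values:
        bin_start = (v // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return counts


def visualization_transformation(counts, max_bar_length):
    """Analytical abstraction -> visualization abstraction: bar lengths in characters."""
    tallest = max(counts.values())
    return [
        (bin_start, round(count / tallest * max_bar_length))
        for bin_start, count in sorted(counts.items())
    ]


def visual_mapping_transformation(bars):
    """Visualization abstraction -> view: a text histogram the user can read."""
    return "\n".join(f"{bin_start:>3} | " + "#" * length for bin_start, length in bars)


values = [1, 2, 2, 3, 5, 6, 6, 6, 9]  # stage 1: value (hypothetical raw data)
counts = data_transformation(values, bin_width=5)  # stage 2: analytical abstraction
bars = visualization_transformation(counts, max_bar_length=10)  # stage 3
view = visual_mapping_transformation(bars)  # stage 4: view
print(view)
```

Each function consumes only the output of the previous stage, which is the property that lets the model describe a technique by the state of its data.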
Tory and Möller [113] argue that these taxonomies are vague because of the termi-
nology used. As an example they cite the use of the word “often” in Card and Mackin-
lay’s definition [113, pp.1]. To reduce this ambiguity, their taxonomy is based on the
data model rather than the type of data itself. A data model is a representation of data
that may include structure, attributes, relationships, and the data values themselves.
Visualization algorithms create a visual representation of data using a data model. The
taxonomy is outlined in Figure 2.11. The visualization algorithms are first classified
as continuous or discrete. Scientific visualization corresponds largely to continuous
models while information visualization corresponds largely to discrete models.
Unlike Card and Mackinlay’s taxonomy, the model-based taxonomy maintains
scalar, vector, and tensor categories for dependent variables. Additionally the taxon-
omy shows greater flexibility than Shneiderman’s taxonomy, categorizing temporally
ordered 3D data as nD data in the continuous model. A limitation of the taxonomy is
that it does not treat temporal data as distinct from 1D data.
Figure 2.11: The Model-based Visualization Taxonomy
2.4 Information Uncertainty Visualization Approaches
Johnson and Sanderson argue that “development of formal theoretical frameworks and
the new visual representations of error and uncertainty will be fundamental to a better
understanding of 3D experimental and simulation data” [51, pp. 5]. This is a relatively
new field in visualization, which is generally referred to as uncertainty visualization.
However, uncertainty and its sources are diverse and the term can have broad meaning.
On the other hand, error visualization (e.g. [51, 83]) and fuzzy visualization (e.g. [93,
39, 5]) imply particular uncertainty modeling techniques. In this thesis we use the
term information uncertainty visualization to refer to visualization of all modeling
techniques where the uncertainty can be codified in information. Thus error- and fuzzy-
visualization are sub-categories of information uncertainty visualization, which is itself
a sub-category of uncertainty visualization. This relationship is shown in Figure 2.12.
Visualization techniques map data variables and information to visual feature di-
mensions for the purpose of highlighting trends, making comparisons, establishing
outliers, examining data composition, and similar purposes. The introduction of uncertainty
requires that appropriate visual features be selected to represent it. Blurring simulates
the visual percept caused by an incorrectly focused visual system and therefore
has the most immediately intuitive mapping for uncertainty [91]. Blurring effectively
[Figure 2.12 diagram. Regions labelled: uncertainty visualization, information uncertainty visualization, error visualization, fuzzy visualization.]
Figure 2.12: Relationship between Uncertainty Visualization, Information Uncertainty
Visualization, Error Visualization, and Fuzzy Visualization
smears the boundary of the graphic representing the data value, creating a sense of
uncertainty as to where it begins and ends. A number of visual features may be used
in a similar manner, including hue, luminance, and saturation, and these can be extended into the
temporal domain through animation [67, 34, 10].
Brown [10, pp. 84] offers a summary of available features drawn from the literature
(e.g. [44, 33, 90]):
• Intrinsic representations - position, size, brightness, texture, color, orientation,
and shape;
• Further related representations - boundary (thickness, texture and color), blur,
transparency and extra dimensionality;
• Extrinsic representations - dials, thermometers, arrows, bars, different shapes,
and complex objects - pie charts, graphs, bars, or complex error bars.
2.4.1 Low-level Features
We now consider how several low-level features can be used to indicate uncertainty
within information. The features to be considered are: hue and luminance, opacity,
depth, texture, particles, glyphs, and sonification.
Hue and Luminance are commonly used to highlight data that is different, or to rep-
resent gradients in the data [115, 56]. Saturation of the hue can be used to high-
light the precision or certainty of the data. The more saturated the hue, the
more certain or crisp the value contained in that region is, while low saturation
regions have the appearance of washing into each other, and can be used to in-
dicate the fuzziness of spatial region boundaries [50, 42]. Variation in hue can
also be used to indicate precision. Regions of higher uncertainty can have fewer
shades, while more precise areas have a smoother appearance. A lack of back-
ground/foreground separation (e.g. red on purple) can also imply uncertainty, as
the region may only just be distinguishable [122]. Brown and Pham [93] used
color hues to represent the membership values of data points. Color hues
were also used by Lowe et al. [70] to represent belief values in the form of a
flame to facilitate decision making in an anaesthetic monitoring system.
Opacity offers an intuitive method for creating blurriness. The more uncertain regions
can be shown with reduced opacity, creating a ghost-like effect. The inverse ap-
proach, used by Djurcilov et al. [24, 25], is to map regions of high uncertainty
to high opacity, thus drawing attention to the uncertain areas in volume visu-
alization (see Figure 2.13). Johnson and Sanderson [51] show an example of a
Magnetic Resonance Imaging (MRI) scan with an added error volume. The error
volume represents the space of possible variation and is transparent so that the
other data is still visible.
Depth can be used to indicate an order or spatial positioning for the data. Pang et
al. [86] and Brown [10] displayed intentionally different images to each eye,
exploiting a lack of binocular fusion to indicate fuzziness. Blurring or depth
of field effects from spatial frequency components being removed in the image
plane can be used to show the indistinct nature of data points [34, 64].
Texture may be applied to objects to indicate the level of precision, ambiguity, or
fuzziness at a location on an object or at a location in space. Pang
Figure 2.13: Using Opacity to Show the Structure of Uncertainty. Color Scheme (left),
Normal Rendering (center), Uncertainty Structure (right)
and Alper [84] used random normal perturbation to create a textured surface.
The effect was proportional to the amount of uncertainty, creating rough regions
where the uncertainty is high. Certain shimmering effects, usually to be avoided
in visualization [115], can be used to indicate ambiguity within the region [111].
Particles can be used to represent the uncertainty of a region or object by varying
their density, opacity, and color. Grigoryan and Rheingans [37, 38] use particle
density to indicate uncertainty. These particle clouds create a similar effect to
transparent volumes. Cartography often also uses a form of this by drawing
dashed lines to represent imprecise lines and boundaries, or by using different
dot densities to represent shading effects [36].
Glyphs are the most widespread method for displaying uncertainty. The size of a
glyph is often used to indicate a scalar measure of uncertainty. For example, error
bars are a traditional technique for indicating errors in measurement [115]. The
larger the error bar, the more uncertainty there is. This concept was expanded
upon by Pang and Freeman [85], who used the size of spherical and ellipsoidal
glyphs to indicate uncertainty in radiosity applications. Lodha et al. [67] inves-
tigated uncertainty glyphs for flow visualizations, also using length to indicate
degree of disagreement. In separate work [68] they used glyphs to show variation
between surface interpolants, finding them to be more precise than using other
features. Wittenbrink et al. [123] mapped variation in vectors to glyph length
and width, to show uncertainty in magnitude and direction. In the same work
they explored glyphs in keyframed animation to expose differences between in-
terpolation techniques.
Sonification is an approach that was explored early on. There are two main methods:
one maps the uncertainty directly to pitch or volume, while the
other uses the degree of uncertainty to regulate a noise generator. Fisher [31]
allowed the user to scan a cursor over a landscape while the program emitted
sound depending on the degree of uncertainty. Lodha et al. [66] went further by
allowing multiple sound variables to be mapped simultaneously, thus increasing
the amount of information conveyed.
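Two of the mappings in the list above, saturation and opacity, are simple enough to sketch directly. The following Python fragment is illustrative only: the function names and the assumption that certainty and uncertainty are normalized to the range [0, 1] are ours, and the linear mappings are just one possible choice. The second function covers both the ghost-like fading and the inverse, attention-drawing mapping attributed above to Djurcilov et al. [24, 25].

```python
import colorsys


def certainty_to_rgb(hue, certainty):
    """Map certainty in [0, 1] to saturation: crisp values are fully saturated."""
    certainty = min(max(certainty, 0.0), 1.0)
    return colorsys.hsv_to_rgb(hue, certainty, 1.0)


def uncertainty_to_opacity(uncertainty, highlight_uncertain=False):
    """Map uncertainty in [0, 1] to opacity (alpha).

    By default uncertain regions fade out, giving a ghost-like effect. With
    highlight_uncertain=True the inverse mapping is used, so uncertain
    regions become more opaque and draw attention.
    """
    uncertainty = min(max(uncertainty, 0.0), 1.0)
    return uncertainty if highlight_uncertain else 1.0 - uncertainty
```

A fully certain red value stays pure red, while a fully uncertain one washes out to white; which opacity direction is appropriate depends on whether the uncertain regions should recede or stand out.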
These low-level features offer an added dimension to which we can map uncertainty
information for a particular plot point. Zhou and Pang [124] looked at several examples
to visualize the level of error between original and reduced resolution meshes in a
multi-resolution mesh algorithm (see Figure 2.14). We now consider how these are
used in higher level constructions and methods that require multiple data points. In our
discussion we include different spatial arrangements, use of image based techniques,
addition and modification of geometry, and the use of animation.
Figure 2.14: Visual Mappings Showing Difference. From Left to Right: Overlay,
Rainbow Mapping, White-black-white Pseudo-coloring, Glyph (Hi-pass), Glyph (low-
pass)
2.4.2 Higher-level Constructions
Uncertainty can be represented in several ways using 2D Cartesian graphs. Some
examples of graphs include histograms, bar charts, tree diagrams, time histories of 1D
slices, maps, iconic and glyph-based diagrams. For example, graphs are often used to
represent the fuzzy membership functions (e.g. Figures 2.1-2.4) or probability density
functions. The structure and inter-relationships of rules can be illustrated using graphs,
trees and flowcharts.
Fuzzy rules involving two inputs can be graphed in three dimensions. Figure 2.15
shows an example from the Matlab Fuzzy Toolbox [75], where the output shows the
amount of tip, as determined by the quality of the food and service. Nürnberger ex-
plored drawing such classifiers as overlapping pyramid shapes. 2D classifiers are visu-
alized as contours for a top-down view [80], whereas 3D classifiers are 3D shapes. An
extension to this work discusses the effects that antecedent pruning has on the shapes
[81]. Pruning of antecedents involves removal of restrictive rules and simplification
of existing rules with the aim of improving the ability of the classification system to
generalize to previously unseen input data. The authors argue that rule simplifications
can have a dramatic impact on results and that visualization of these changes can pro-
vide an intuitive aid for fuzzy classifier designers. While the technique produces an
intuitive aid, the authors have not gone far enough. Since the classifier is visualized as
a shape that occupies the same space as the data, it suggests that it can be visualized
together with the data. This would allow the user to observe how the data points of a
particular data set classify, particularly when combined with animation or interactive
techniques. Possible extensions include using size, color, and translucency to enhance
perception of the classification given to a data point. Cox et al. [20] applied thresholds
to produce convex hull plots of data point clusters, using glyphs of different shapes and
sizes for the data points.
A limitation of these techniques is that they are not well suited to multi-dimensional
data. Techniques such as multi-dimensional scaling [6] and parallel coordinates [39]
Figure 2.15: Tip Level Based on the Quality of the Food and Service Using Fuzzy
Inference
provide ways to display multi-dimensional fuzzy data in 2D without loss of informa-
tion. However, the degree of membership is not indicated in a standard parallel coordi-
nate plot. Berthold and Hall [5] use blurring to expose the level of fuzzy membership
on parallel coordinates. An alternative proposal by Pham and Brown [90] extends
coordinates to the third dimension, where the new dimension represents the member-
ship value. One technique for multi-dimensional scaling involves an algorithm that
minimizes the inter-point distances. The rule set is then visualized as a 2D scatter
plot, where gray scales denote different classes and the size of each square indicates
the number of examples [93]. Another technique for viewing high-dimensional fuzzy
rules in 2D places rules as shapes on a grid. The distance between rules in high-
dimensional space is mapped to their distance in 2D. The technique uses a gradient
descent algorithm to minimize the error between the 2D and actual distances [6].
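The idea of placing high-dimensional items in 2D by gradient descent on a distance error can be sketched as follows. This is a generic sketch and not the algorithm of [6]: the squared-error objective, the step size, the number of steps, and the four hypothetical rule centers are all assumptions chosen for illustration.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance between every pair of points."""
    return [[math.dist(p, q) for q in points] for p in points]


def stress(layout, target):
    """Sum of squared differences between 2D and target distances."""
    n = len(layout)
    return sum(
        (math.dist(layout[i], layout[j]) - target[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def embed_2d(target, steps=500, rate=0.05, seed=0):
    """Gradient descent on the stress, starting from a random 2D layout."""
    rng = random.Random(seed)
    n = len(target)
    layout = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(steps):
        gradients = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                # Guard against division by zero for coincident points.
                d = math.dist(layout[i], layout[j]) or 1e-9
                # d/dx_i of (d - target)^2 is 2 * (d - target) * (x_i - x_j) / d
                scale = 2.0 * (d - target[i][j]) / d
                gradients[i][0] += scale * (layout[i][0] - layout[j][0])
                gradients[i][1] += scale * (layout[i][1] - layout[j][1])
        for i in range(n):
            layout[i][0] -= rate * gradients[i][0]
            layout[i][1] -= rate * gradients[i][1]
    return layout


# Hypothetical rule centers in a 3D input space.
rules = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
target = pairwise_distances(rules)
layout = embed_2d(target)
```

Because four such points cannot in general be laid out in the plane with all distances preserved, the remaining stress indicates how much the 2D picture distorts the original arrangement.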
When visualizing clusters it is often a requirement to find outliers in the data. One
method to improve the identification of outliers in fuzzy classification problems is to
modify the “objective function”. Keller proposed additional weighting parameters for
“representativeness” [55, pp. 143]. The application of this technique produces the
same principal clusters, but outliers are more easily detected since they are excluded
to a greater degree from the fuzzy clusters.
Fujiwara et al. [32] and Gershon [34] produced a 3D flowchart to represent rule
structure to facilitate understanding of rule-based programs. This is an extension of
the cone tree visualization technique [101]. Dickerson et al. [23] used a graph to
encode relationships in a complex interacting system. This technique is useful for
encoding expert information which is commonly present in fuzzy control systems.
Brown and Pham [11] extended these techniques further by mapping uncertainty to additional
features (such as opacity) for each node.
Image based techniques can also be used to convey uncertainty. These methods are
the uncertainty analogs of image based visualization techniques such as Line Integral
Convolution (LIC) [40]. In these methods a pattern is generated that abstractly reflects
the uncertainty. One difference between image based techniques and glyphs is that
image based techniques apply a regular pattern over a continuous area. This avoids
clutter sometimes experienced by glyph techniques where the glyphs obstruct one an-
other. Sanderson et al. [103] used reaction-diffusion models in flow visualizations and
conveyed uncertainty through spot size and orientation.
Pang and Freeman [85] (see also [86]) observed that geometry can be added or
modified to indicate uncertainty. An example of modification is to create a texturing-
like effect by perturbing the orientation of faces within a geometric mesh model. The
amount of perturbation is governed by the degree of uncertainty. There are two com-
mon examples of adding geometry. The first is to add geometry for a single data point,
typically to give a direct indication of the extent over which the object can exist. An-
other is to connect successive data point extents, simulating the volume of possibility.
An example of the latter was demonstrated by Lopes and Brodlie [69], who used tubes
for particle flow visualization.
All of the low-level visual features that have been discussed can be animated. For
example, motion blur, flickering, and animated glyphs can represent the precision
of the measurements of a moving object [123, 10]. Brown [10] explored temporal vibrations
for conveying uncertainty. The vibrations oscillate between values fast enough
to be pre-attentive [114], causing a shimmering effect that implies uncertainty about
its true position. Figure 2.16 shows frames from a movie of the luminance oscillation
technique, with a region of high uncertainty framed within the dashed rectangle. This
technique can also be applied in stereographic displays to facilitate a lack of binocular
fusion [10].
Figure 2.16: Two Frames from a Luminosity Oscillation Animation
Probability distributions are often graphed as 2D line graphs. Kao et al. [52, 53,
54] explored showing multiple data points, each of which is subject to a probability
distribution. In one example they overlaid the probability density functions for points
of interest, as shown in Figure 2.17. Other approaches included using color, texture,
and heightmaps to indicate uncertainty. Luo et al. [72] plotted many small histograms
in a small multiples [115] technique.
Figure 2.17: Visualization With Probability Density Functions Over Associated Data
Points
The Geographic Information Systems (GIS) field has had a particular interest in
information uncertainty visualization. MacEachren et al. [74] and Slocum et al. [107]
methodically review the state of play with respect to uncertainty visualization in GIS.
Outside of this field, the development of visualization techniques for information uncertainty
is typically ad hoc, with each technique created for a specific modeling technique or application.
All of these represent important steps forward; however, an integrated framework
to manage the modeling and visualization of information uncertainty is currently
missing.
2.5 Summary
In summary, the information uncertainty modeling and general visualization fields
have separately been well studied. Several visualization techniques have been created
for the various information uncertainty models. However, which one is best suited to
the task at hand, and how will uncertainty modeling and propagation be tracked and
interpreted properly? What happens when the information uncertainty changes? In-
formation uncertainty modeling represents our knowledge and expectation about the
behavior of a variable under uncertainty, and this knowledge may be subject to change
over time, particularly as new information comes to light. Currently, there is no inte-
grated framework for the modeling, propagation, and visual mapping of information
uncertainty. Furthermore, there is no framework that can adapt to changes in informa-
tion uncertainty.
CHAPTER 3
Framework for Integrated Uncertainty
Modeling and Visualization
3.1 A New Approach to Information Uncertainty
Traditional visualization systems, which typically do not deal with information uncertainty,
can still be subject to dynamic data. For these systems the dynamism refers to
changes in value. The result is that the visualization needs to be recalculated, which is
a straightforward process. Changes in information uncertainty, on the other hand, provide
a unique challenge: the actual modeling technique used can change in response to
changing information. Therefore, the data-type of the variable can be dynamic. This
is clearly illustrated by the case of prediction: before the event comes to pass, the uncertainty
can be modeled using a number of techniques; once the event has passed, the
prediction can be updated with the actual outcome (assuming the outcome is known, the
data-type then becomes one of absolute certainty). Thus the visualization must not
only be recalculated, but must also adapt to this new data-type.
Adapting a visualization to new data-types is not a straightforward process.
Visualization techniques are designed for a particular data-type and may not support
the new data-type without modification. For example, line graphs rely on a series of
values between which line segments are connected. Should the source of information
be defined by a series of intervals, then the traditional line graph is no longer appro-
priate. One suitable modification turns the line segments into convex polygons, whose
edges are defined by the upper and lower bounds of the interval.
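The modification described above can be sketched in a few lines of Python. The function name and the vertex ordering are assumptions made for illustration; the only requirement is that each interval has a lower bound no greater than its upper bound, so that every segment becomes a convex quadrilateral (possibly degenerate where an interval collapses to a single value).

```python
def interval_band(xs, intervals):
    """Turn a series of (lower, upper) intervals into one quadrilateral per segment.

    Each quadrilateral is listed counter-clockwise: lower-left, lower-right,
    upper-right, upper-left.
    """
    polygons = []
    for i in range(len(xs) - 1):
        (lo0, hi0), (lo1, hi1) = intervals[i], intervals[i + 1]
        polygons.append(
            [(xs[i], lo0), (xs[i + 1], lo1), (xs[i + 1], hi1), (xs[i], hi0)]
        )
    return polygons


# Hypothetical series: a certain first value followed by two intervals.
band = interval_band([0, 1, 2], [(1, 1), (0, 3), (2, 4)])
```

A certain value is simply an interval whose bounds coincide, so the same routine draws an ordinary line where there is no uncertainty and widens into a band where there is.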
It is recognized that visualization systems should convey uncertainty [83].
Many problems are subject to uncertainty and, as a consequence, visualization research
has produced several visualization technique modifications to support uncertainty.
For example, transparent volumes have been added to volume renderings to indicate
potential error (e.g. [51, pp.9]), and parallel plots were extended into the third dimen-
sion to handle fuzzy variables [93]. This type of work continues, and the outcomes
continue to be data-type specific.
The objective of this thesis is to integrate the process of modeling and visualizing
information uncertainty into an extensible and adaptive visualization framework. Such
a framework will provide greater uniformity for the field and enable both practitioners
and researchers to reduce the data-type and visualization technique dependency. The
process that the user follows when armed with such a tool can therefore change.
The typical process that a user follows when dealing with uncertainty consists of
the following steps.
1. Decide on variables
2. Decide on uncertainty data-type(s)
3. Build the data model, propagating uncertainty manually
4. Construct visualization(s), incorporating uncertainty where techniques are avail-
able and appropriate
In practice, steps 3 and 4 will be repeated as information changes. However, step 2 will
rarely be revisited. The significant point of this process is that the uncertainty model
is decided upon before the user’s data model is built. This can be unintuitive, as the
amount of uncertainty can change depending on how the model pans out. If it were
easy to add, change, or remove uncertainty details at any point in the process, then the
typical process would change as follows.
1. Decide on variables
2. Build an initial data model
3. Construct visualization(s)
4. Add/remove/change uncertainty information
Here, step 4 can occur anywhere after step 1 and can be repeated as often as is
necessary. Under such a process the uncertainty information is viewed as a refinement
on details that does not fundamentally change the data model.
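The following Python sketch illustrates what it means for uncertainty to be a refinement that leaves the data model untouched. It is a hypothetical illustration and not the framework developed in this thesis: the class names, the interval representation, and the example figures are all assumed.

```python
class Interval:
    """Hypothetical uncertainty detail: the value may lie below or above by given amounts."""

    def __init__(self, below, above):
        self.below = below
        self.above = above

    def bounds(self, value):
        return (value - self.below, value + self.above)


class Variable:
    """Hypothetical variable whose uncertainty details can be swapped at any time."""

    def __init__(self, value, uncertainty=None):
        self.value = value
        self.uncertainty = uncertainty  # None means the value is treated as certain

    def bounds(self):
        """Lower and upper extent of the variable under its current uncertainty."""
        if self.uncertainty is None:
            return (self.value, self.value)
        return self.uncertainty.bounds(self.value)


def total(variables):
    """A data model that refers only to variables, never to uncertainty parameters."""
    lows, highs = zip(*(v.bounds() for v in variables))
    return (sum(lows), sum(highs))


# Steps 1 and 2: decide on variables and build the model (hypothetical figures).
rent = Variable(1000.0)
food = Variable(300.0)
before = total([rent, food])

# Step 4: add uncertainty later, without touching the model itself.
food.uncertainty = Interval(below=50.0, above=100.0)
after = total([rent, food])
```

The `total` function is never edited: adding, changing, or removing the uncertainty on `food` changes only the extent of the result, which is the sense in which uncertainty acts as a refinement on details.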
This chapter describes an integrated framework for the modeling and visualization
of information uncertainty. This framework is adaptive to changes in uncertainty in-
formation, allowing the user to select the appropriate techniques for the task at hand.
In Section 3.2 we consider the issues that must be overcome, from which we derive re-
quirements of this framework. Section 3.3 describes the components of the framework
to meet the requirements. Section 3.4 provides a summary of key points.
3.2 Analysis of Issues and Requirements
This section examines the issues that confront users when they seek to visualize infor-
mation uncertainty. From a theoretical perspective there are three main issues. Firstly,
visualization techniques are based around specific uncertainty data-types. Thus, vi-
sualization techniques tend to be ad hoc. Secondly, there is incoherence between in-
formation uncertainty modeling techniques. This locks users into a particular model-
ing technique, the appropriateness of which may change as the information evolves.
Thirdly, information uncertainty modeling and visualization is hampered by an artifi-
cial separation between the value of a variable and the uncertainty model of that value.
This poses problems that affect the robustness of user models and the effort required
to maintain them.
From a practical point of view, the user is required to have both a comprehen-
sive understanding of uncertainty as well as sophistication with visualization tools.
Comprehensive understanding is required, because the user must manually encode and
propagate uncertainty information; sophistication with visualization tools is required
to allow for the unusual demands of mapping uncertainty to visual elements. Many
tools lack support for information uncertainty modeling and visualization, leading de-
termined users to cobble together multiple tools.
3.2.1 Ad hoc Visualization Techniques
Sensemaking Cycle and Changes in Uncertainty Information
Visualization is “the bringing out of meaning in information” [56]. It is performed it-
eratively and usually as part of the sensemaking cycle [102, 17]. The iterative looping
is not exclusive to mapping data into visual form; instead, users sometimes return to
the data model to gather or transform data. This is particularly true for information un-
certainty. For example: uncertainty details can be deemed to be more important later,
once the basic model is in place; or the uncertainty details may change as more be-
comes known about the variables. Therefore, frameworks for information uncertainty
visualization should ideally allow the user to go back to make changes with minimal
effort.
Flexibility
Visualization of information uncertainty is different to visualizing other forms of in-
formation for two main reasons. Firstly, information uncertainty is always associated
with a particular unit of information. This means that the uncertainty cannot be freely
visualized without regard to its interpretation relative to the information to which it
belongs. Secondly, information uncertainty is usually mapped differently to visual el-
ements. For example, uncertainty is commonly mapped to intrinsic properties, such as
transparency or color; or by adding a dimension to geometry, such as using a surface
where there would otherwise be a line. Therefore, a visualization system for informa-
tion uncertainty requires the flexibility to allow users to map uncertainty to compound
visual elements, including intrinsic properties and adding dimensions to geometry.
Figure 3.1 demonstrates how information uncertainty is associated with informa-
tion, but typically mapped differently to visual elements. Four graph visualizations of
historical and predicted employment rates in California are shown. The first graph (a)
assumes that growth will continue at the average growth rate of the past 15 years and
is therefore visualized using traditional means. While the information in graph (a) is
modeled as not being subject to uncertainty, it relies on the unreliable assumption that
growth will continue at the average rate. The graph in (b) estimates that the growth will continue
at the average rate. The fact that the predictions are estimates is indicated by the line
stippling, an intrinsic property of the line. The graph in (c) shows the possible range
within the maximum and minimum growth rates experienced in the past 15 years. The
uncertainty is indicated by extending the one dimensional line into a two dimensional
polygon. The graph in (d) uses a normal distribution centered on the average growth
rate. The uncertainty is indicated by both extending the dimensionality of the line as
well mapping to the intrinsic property of opacity.
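The two compound mappings described for graphs (c) and (d) can be illustrated with a
minimal sketch. It is not the implementation used in this thesis: the function names
`project_band` and `opacity`, the compound-growth formula, and the sample growth rates
are all assumptions made for illustration, and the rates are not the California data.

```python
import math


def project_band(last_value, rates, years):
    """Project a value forward under the lowest, mean, and highest growth rate.

    Returns one (low, mean, high) tuple per projected year. The low and high
    series bound the polygon that replaces the line, as in graph (c).
    """
    lo, hi = min(rates), max(rates)
    avg = sum(rates) / len(rates)
    return [
        (last_value * (1 + lo) ** t,
         last_value * (1 + avg) ** t,
         last_value * (1 + hi) ** t)
        for t in range(1, years + 1)
    ]


def opacity(rate, rates):
    """Opacity in [0, 1] for a growth rate, as in graph (d).

    Uses a normal curve centered on the mean historical rate and scaled so
    that the mean itself is fully opaque (an assumed scaling).
    """
    avg = sum(rates) / len(rates)
    var = sum((r - avg) ** 2 for r in rates) / len(rates)
    if var == 0:
        return 1.0 if rate == avg else 0.0
    return math.exp(-((rate - avg) ** 2) / (2 * var))


# Hypothetical historical growth rates, for illustration only.
history = [0.01, 0.02, 0.03]
band = project_band(100.0, history, years=2)
```

The band supplies the added geometric dimension, while `opacity` supplies the intrinsic
property. A visual mapping of the kind shown in graph (d) would combine both.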
Heterogeneity in Uncertainty Information
Several uncertainty visualization techniques have been developed for particular
uncertainty types. However, in an environment where the uncertainty type can change to
better suit the needs of the user, such restrictive preconditions for visualization
techniques mean a return to the tyranny of uncertainty type lock-in. Therefore, an
approach to visualizing information uncertainty must provide greater consistency across
different uncertainty modeling techniques.
Figure 3.1: Visualizations of Employment Numbers in California. Years 2005-2010
are Predicted. (a) Assuming Average Growth (b) Indicating Growth is Estimated (c)
Possible Growth (d) Likely Growth. (Data Source: California Employment Development
Department)
Homogeneous Access
To enable the visual mappings that expose the uncertainty in variables, it is necessary
to have access to the associated uncertainty details. However, there are numerous un-
certainty modeling techniques that use different methods for encoding the uncertainty.
This creates a barrier to visualizing uncertain information because visual mappings
that work with one uncertainty modeling technique may not be easily transferable to
another. Such inconsistency creates a strong dependency between visualizations and
the data-types used in the model, limiting the user’s ability to update the data model.
Therefore, a generalized means for accessing uncertainty information should be sought
to enable a consistent environment for information uncertainty visualization.
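One way to picture such a generalized means of access is a small common interface that
every uncertainty type implements. The sketch below rests on assumptions: the
`Uncertain` protocol, its `bounds` and `weight` methods, and the two example types are
hypothetical names chosen for illustration, not the data structures defined later in
this thesis.

```python
import math
from dataclasses import dataclass
from typing import List, Protocol, Tuple


class Uncertain(Protocol):
    """Hypothetical interface: all that a visual mapping may ask of a value."""

    def bounds(self) -> Tuple[float, float]: ...

    def weight(self, x: float) -> float: ...


@dataclass
class Interval:
    low: float
    high: float

    def bounds(self) -> Tuple[float, float]:
        return (self.low, self.high)

    def weight(self, x: float) -> float:
        # Every value inside the interval is treated as equally possible.
        return 1.0 if self.low <= x <= self.high else 0.0


@dataclass
class Normal:
    mean: float
    std: float

    def bounds(self) -> Tuple[float, float]:
        # Three standard deviations is an arbitrary display cut-off.
        return (self.mean - 3 * self.std, self.mean + 3 * self.std)

    def weight(self, x: float) -> float:
        # Unnormalized density, so that the mean has weight 1.
        return math.exp(-0.5 * ((x - self.mean) / self.std) ** 2)


def band_with_opacity(value: Uncertain, steps: int = 5) -> List[Tuple[float, float]]:
    """Sample (position, opacity) pairs without knowing the uncertainty type."""
    low, high = value.bounds()
    xs = [low + i * (high - low) / (steps - 1) for i in range(steps)]
    return [(x, value.weight(x)) for x in xs]
```

Because `band_with_opacity` depends only on the interface, replacing an `Interval` with
a `Normal` in the data model would leave the visual mapping unchanged.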
Plurality of Values
Fundamental to the concept of information uncertainty is the ability for a variable to
hold multiple values simultaneously; in other words, the variable has multiple possible
collapses. This plurality of values represents the deferral of the approximation
decision: the true value of a variable may be one of multiple candidates, each of which
should be considered a possibility.
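The idea of a variable that holds several candidate values until it is collapsed can be
sketched as follows. The `Plural` class, its method names, and the two collapse
strategies are assumptions made for illustration; the addition rule further assumes
that the two operands are independent.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Plural:
    """A variable that keeps every candidate value with a relative weight."""

    candidates: Dict[float, float]  # candidate value -> relative weight

    def collapse_most_likely(self) -> float:
        """Collapse to the single candidate with the largest weight."""
        return max(self.candidates, key=self.candidates.get)

    def collapse_expected(self) -> float:
        """Collapse to the weighted mean of all candidates."""
        total = sum(self.candidates.values())
        return sum(v * w for v, w in self.candidates.items()) / total

    def __add__(self, other: "Plural") -> "Plural":
        # Combine every pair of candidates; weights multiply because the
        # operands are assumed to be independent.
        out: Dict[float, float] = {}
        for a, wa in self.candidates.items():
            for b, wb in other.candidates.items():
                out[a + b] = out.get(a + b, 0.0) + wa * wb
        return Plural(out)


# Hypothetical candidates: the approximation decision is deferred.
growth = Plural({1.0: 0.25, 2.0: 0.5, 3.0: 0.25})
total = growth + growth
```

Here `total` still carries every possible sum. A single number is produced only when
one of the collapse methods is called, which is the point at which the approximation
decision is finally made.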
3.2.2 Incoherence of Uncertainty Models
Uncertainty Data Type Lock-in
There is usually no support for changing from one uncertainty modeling technique to
another. Adding uncertainty information to data allows the user to specify a greater
level of detail about the data. However, changing the uncertainty data-type typically
requires users to reconstruct the affected portion of the data model, often involving a
fundamental change in form. This makes the data model rigid and, as a consequence,
users will typically need to anticipate their use of uncertainty and build their model
accordingly.
  • 18. 3 Framework for Integrated Uncertainty Modeling and Visualization
    3.1 A New Approach to Information Uncertainty
    3.2 Analysis of Issues and Requirements
    3.2.1 Ad hoc Visualization Techniques
    3.2.2 Incoherence of Uncertainty Models
    3.2.3 Artificial Separation of Information and Uncertainty
    3.3 Components of the Framework
    3.3.1 Spreadsheet Paradigm
    3.3.2 Uncertainty Encapsulation
    3.3.3 Uncertainty Abstraction
    3.4 Summary
    4 Spreadsheet Paradigm for Information Uncertainty
    4.1 Motivation and Objectives
    4.2 Related Work on Spreadsheets
    4.3 Architecture and Features
    4.3.1 Uncertainty Encapsulation
    4.3.2 Uncertainty Abstraction
    4.4 New Process and Workflow
    4.5 Capabilities and Advantages
    4.6 Case Study: Financial Decision Support
    5 Uncertainty Encapsulation and Automated Propagation
    5.1 Introduction
    5.2 Unified Information Uncertainty Framework
    5.2.1 Conceptualizing Information Uncertainty and its Usage
    5.2.2 Categorization of Uncertainty Models
    5.2.3 Data Structures for Information Uncertainty
    5.3 Automated Propagation of Information Uncertainty
    5.3.1 Uncertainty Propagation Model
    5.3.2 Hierarchical Heterogeneous Propagation
    5.4 Summary
    6 Uncertainty Abstraction for Visualization
    6.1 Motivation and Objectives
    6.2 User-objectives for Information Uncertainty Visualization
    6.2.1 Analysis of User-Objectives
    6.2.2 A Computer Assisted User-Objectives Selection Method
    6.3 Uncertainty Abstraction Models
  • 19. 6.3.1 The Unified Uncertainty Model
    6.3.2 The Dual Uncertainty Model
    6.3.3 Design and Use
    6.3.4 Alternative Models
    6.4 Case Study: User-Objectives in Financial Decision Support
    6.5 Case Study: Relevancy Objective in Business Process Management
    7 Integration of Core Features
    7.1 Introduction
    7.2 Design Considerations
    7.3 Architecture
    7.3.1 User Interface
    7.3.2 Core Components
    7.3.3 Plugin Components
    7.4 Summary
    8 Advanced Features and Extensibility
    8.1 Introduction
    8.2 Advanced Features
    8.2.1 Hierarchical Spreadsheets
    8.2.2 Floating Observers and Embedded Visualizations
    8.2.3 Customization
    8.3 Extensibility
    8.4 Case Study: GPGPUSheet
    9 Evaluation
    9.1 Introduction
    9.2 Quantitative Analysis
    9.2.1 Construction Experiments
    9.2.2 Retrospection Experiments
    9.2.3 Discussion
    9.3 Sensitivity Analysis Surveys
    9.3.1 First Survey
    9.3.2 Second Survey
    9.4 Case Study: Business Planning
    10 Conclusion and Future Work
    10.1 Achievements
    10.2 Limitations
  • 20. 10.3 Possible Applications and Extensions
    Bibliography
    A First Survey
    B Second Survey
  • 21. List of Figures
    2.1 Fuzzy Set for Hot
    2.2 Fuzzification for Temperature
    2.3 Results of Fuzzy Operations are Shown by the Grey Shaded Regions
    2.4 Defuzzification Using an α-cut
    2.5 Example Rough Set for Containment of a Region
    2.6 The Multi-agent Visualization Support System
    2.7 The Visualization Task Network (VTN) Learns Task-oriented Visualization Parameters
    2.8 An Ontology of Visualization
    2.9 Visualization Techniques Categorized by the Type of Data to be Visualized
    2.10 Examples of Selected Visualization Techniques
    2.11 The Model-based Visualization Taxonomy
    2.12 Relationship between Uncertainty Visualization, Information Uncertainty Visualization, Error Visualization, and Fuzzy Visualization
    2.13 Using Opacity to Show the Structure of Uncertainty. Color Scheme (left), Normal Rendering (center), Uncertainty Structure (right)
    2.14 Visual Mappings Showing Difference. From Left to Right: Overlay, Rainbow Mapping, White-black-white Pseudo-coloring, Glyph (Hi-pass), Glyph (low-pass)
    2.15 Tip Level Based on the Quality of the Food and Service Using Fuzzy Inference
    2.16 Two Frames from a Luminosity Oscillation Animation
  • 22. 2.17 Visualization With Probability Density Functions Over Associated Data Points
    3.1 Visualizations of Employment Numbers in California. Years 2005-2010 are Predicted. (a) Assuming Average Growth (b) Indicating Growth is Estimated (c) Possible Growth (d) Likely Growth. (Data Source: California Employment Development Department)
    3.2 Description of the Framework
    3.3 Additional Layer in Spreadsheet Hierarchy
    3.4 Screenshot of the Prototype System
    4.1 Basic Cell Type Object Hierarchy
    4.2 Novel CellType Object Hierarchy
    4.3 Screen-shot of the Prototype
    4.4 Visualization Sheet for the Graph in Figure 4.7
    4.5 Process for Constructing an Uncertainty Spreadsheet
    4.6 Interval Modeling Example: (a) Original Model (b) Traditional Spreadsheet (c) Prototype System Uncertainty Hidden (d) Prototype System Uncertainty Shown
    4.7 Using An Interval (±0.5) for Annual Change in Interest Rates Propagates the Uncertainty to NPV
    4.8 Volumetric Representation of the Most Likely Effect Interest Rate Changes Will Have on NPV.
    5.1 Progression Through States of Information Uncertainty (Boxes) as a Result of Information (Arrows)
    5.2 Projection of Information Uncertainty onto an Estimate Point
    5.3 All Collapses of the Interval [4.5,5.5]
    5.4 All Collapses of a Fuzzy Number Around 5
    5.5 Collapses for a Fuzzy Number Around 5, Using a Cut Plane
    5.6 The Interval Data Structure
    5.7 Definition of Continuous Rough Set Using a Marker Sequence
    5.8 Definition of Linearly Defined Fuzzy Set Using a Marker Sequence
    5.9 Information Uncertainty Modeling Techniques Sorted into Three Strata
    5.10 Example of Increasing Levels of Uncertainty Information
    5.11 Sample Promotion/Demotion Graph
    6.1 Graphs illustrating the Visual Treatment of Information with Variable Degrees of Uncertainty under Different Objectives.
    6.2 Schematic Illustration of the Dual Uncertainty Model
  • 23. 6.3 UML Diagram of the Dual Uncertainty Model
    6.4 Illustration of the Quad Uncertainty Model
    6.5 Example of a Recursive UUM
    6.6 Possible Effects of House Price Movements on NPV (2D).
    6.7 Possible Effects of House Price Movements on NPV (3D).
    6.8 Possible Effects of House Prices and Interest Rates on NPV.
    6.9 Most Likely Profitability Resulting from Changes in Interest Rates
    6.10 Volumetric Representation of the Most Likely Effect Interest Rate Changes Will Have on NPV.
    6.11 Likelihood of NPV
    6.12 Effect of Interest Rate Changes Grouped into 5 Year Periods
    6.13 Optimum Time to Sell the Property Under Different Economic Conditions.
    6.14 Architecture of the Case Study System
    6.15 YAWL query: Prototype Tool for the Graphical Business Specification Reduction
    6.16 Production Rules for Reducing a YAWL Graph
    6.17 Original Graph Prior to Simplification
    6.18 Reduced Specification using Collapse Approach (α = 2.5)
    6.19 Reduced Specification for Text Query “legal” Using Decimation Approach (β = 0.5, α = 1)
    7.1 Design of the User Interface
    7.2 The Spreadsheet Architecture
    7.3 High-level Architecture
    7.4 View and Controller Classes for the Spreadsheet
    7.5 The Core Components
    7.6 UML Inheritance Diagram for the Kernel Class
    7.7 Main Classes in the Datamodel Component
    7.8 Relationship of the Dependency Graph to Cells and CellContainers
    7.9 Cell C is Dependent on A Multiple Times
    7.10 UML Inheritance Diagram for the DependencyGraph Class
    7.11 Example Formula and its CodeTree
    7.12 Formula Language Definition
    7.13 UML Diagram for the CodeTree Class
    7.14 UML Inheritance Diagram for the Propagation Models
    7.15 The IPropagationMethod Class
    7.16 The IPlugin Interface
    7.17 The ICellType Interface
  • 24. 7.18 The IDUMMethod Interface
    7.19 The UncertaintyRange and UncertaintyRangeSet Classes
    7.20 UML Inheritance Diagram for the IVisualElement Interface
    8.1 Hierarchical Spreadsheet Prototype. The Parent Sheet (Left) Contains the Child Sheet (Right)
    8.2 Floating Observer, Observing the Uncertainty Line Graph in Cell D14
    8.3 Dependency Tree for the Floating Observer from Figure 8.2
    8.4 CellType List Editor
    8.5 Propagation Model Editor
    8.6 Propagation Model, Method, and Model Set
    8.7 Propagation Model Set Editor
    8.8 Dual Uncertainty Model Selector
    8.9 Prototype system for GPGPU visualization
    9.1 Spreadsheet for First Experiment, Without Uncertainty
    9.2 Spreadsheet for First Experiment, With Uncertainty
    9.3 Spreadsheet for First Experiment, Analytical Approach Using Traditional Methods
    9.4 Monte-Carlo Spreadsheet for First Experiment
    9.5 Construction Cost Graph
    9.6 Number of Formulae in Construction Experiments
    9.7 Formula and Layout Changes During Retrospection Experiments
    9.8 Background of Respondents
    9.9 Average Completion Time
    9.10 Questions and Average Responses
    9.11 Hierarchical Business Plan Spreadsheet
    9.12 Market Share Overview with Embedded Sheets
    9.13 Market Share for Year 1
    9.14 Target Market
    9.15 Typical Break-even Visualization
    9.16 Probabilistic Break-even Visualization
  • 25. List of Tables
    2.1 Sources and Causes of Information Uncertainty
    2.2 Data Stages in the Data State Model
    2.3 Transformation Operators in the Data State Model
    4.1 Format of Cells in the Prototype System
    4.2 Prototype Uncertainty Interrogation Functions
    5.1 Predicted Growth Rates used in Figure 3.1
    5.2 Categories of Information Uncertainty Modeling Techniques
    5.3 Common Information Uncertainty Modeling Types
    6.1 Information Uncertainty Visualization Objectives
    6.2 Questions Used to Elicit the User-Objective
    8.1 Comparison of IntervalCell and SpreadsheetCell
    8.2 Examples from the Prototype Addressing Scheme
    8.3 Novel Cell Types for GPGPU
    8.4 Novel Functions for GPGPU
    9.1 A Selection of Normal Probability Functions in Microsoft Excel 2003
    9.2 Actions of the User
    9.3 Retrospection Cost
    9.4 Respondents to the Survey
  • 27. Statement of Original Authorship
    The work contained in this thesis has not been previously submitted for a degree or diploma at any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.
    Signature: Alexander Streit Date:
  • 29. CHAPTER 1 Introduction
    1.1 Motivation
    The term information uncertainty refers to vagueness, imprecision, fuzziness, likelihood, and related uncertainty as it is present in information. Many problems are subject to information uncertainty and, in response, numerous techniques have been developed to model this uncertainty. Modeling information uncertainty not only provides greater confidence in results, but can also give an indication of how much confidence to place in the results. While visualization is a popular tool, information uncertainty visualization is far less widespread.
    In this work we have identified four major barriers to the uptake of information uncertainty modeling and visualization. Firstly, there are numerous information uncertainty modeling techniques, each of which is treated differently. This forces users to anticipate their information uncertainty needs before building their data model. Secondly, parameters of the uncertainty space tend to be treated at the same level as variables, which makes it easier to introduce avoidable errors and causes the information
  • 30. uncertainty modeling technique to dictate the structure of the user’s model. Thirdly, propagation of uncertainty information must be manually managed by the user, which requires expertise, is error-prone, and can be tedious. Fourthly, uncertainty visualization techniques tend to be developed for particular information uncertainty types and they are largely incompatible with other forms of uncertainty information. This narrows the selection of visualization techniques available and results in a tendency for ad hoc information uncertainty visualization techniques.
    Information uncertainty modeling makes it more difficult to manage the data model due to increased information. Furthermore, a chosen uncertainty modeling technique will commonly need to be changed later, since knowledge about the uncertainty changes as more information becomes available. This is currently a difficult and error-prone process.
    Visualization of information uncertainty poses its own unique challenges. Existing visualization techniques may not be appropriate for uncertainty information and there are issues with information overloading and interpretability of results. On a practical level, there is a lack of tools that are conducive to visualizing information uncertainty.
    Easing the burden of managing the information overload in modeling and visualization requires an integrated system that covers the entire workflow cycle from data acquisition to visualization. Tools are also needed to help users with higher-level tasks such as selection of modeling and propagation options, and organization and comparison of visual mappings. More specifically, the architecture should support automated uncertainty propagation and allow easy switching between different uncertainty models and different methods of display.
    Spreadsheets are often used to perform uncertainty-based analysis and they have previously been shown to be well suited as an approach to visualization. However, the benefit of a spreadsheet approach to uncertainty modeling and visualization has not yet been explored. This thesis extends the spreadsheet paradigm to support information uncertainty modeling and visualization in an integrated whole.
  • 31. 1.2 Aims
    The overall aim for this thesis is to devise an integrated information uncertainty modeling and visualization environment that has the following features:
    Hierarchical structure: The system should differentiate between levels of detail in the data model. Uncertainty information is of a lower level of detail than the variables.
    Reduce data-type lock-in: If a data model is constructed using particular information uncertainty modeling techniques, the cost to change to another modeling technique should be minimized.
    Adaptive: Information about the uncertainty space of a variable should be easy to add, change, or remove at any stage of the modeling and visualization process.
    Seamless integration of information and its uncertainty: There should not be an artificial separation between the information and its uncertainty.
    Simplify information uncertainty modeling: Users should not be required to have an intimate understanding of the modeling technique mechanics in order to use it.
    Automate propagation: Uncertainty information needs to be propagated and the system should carry this out automatically.
    Less error prone: The system should reduce the potential for user induced errors.
    Flexible: Users should be able to map uncertainty information into alternative models and visual features so that they can explore the impacts of different modeling and visualization techniques.
    Robust: When the uncertainty information changes, the existing data model and visualizations should continue to function correctly.
  • 32. Extensible: There are numerous information uncertainty modeling techniques and the design of the system should allow for more to be added.
    In order to achieve this aim, the following tasks are performed:
    • Examine the field to determine the current state of play, covering information uncertainty modeling techniques, visualization processes and practices, and uncertainty visualization;
    • Design an integrated information uncertainty modeling and visualization framework;
    • Investigate how the spreadsheet paradigm can be extended to intrinsically support information uncertainty modeling and visualization;
    • Explore uncertainty encapsulation as an approach to semantic association of information and its uncertainty;
    • Develop an automated propagation mechanism and a method for resolving unusual modeling technique combinations;
    • Design uncertainty abstractions that enable visualization mappings to be data-type independent;
    • Explore the user-objectives approach as a means for defining visualization characteristics;
    • Conduct a quantitative analysis comparing the cost of our approach to existing methods;
    • Analyze feedback from potential users;
    • Conduct case studies on financial decision support and business planning to establish the viability of the spreadsheet for commercial uses;
  • 33.
    • Investigate, through a case study, the capability of the architecture to be applied to non-uncertainty uses; and
    • Draw conclusions and make recommendations for future work.
    1.3 Scope
    This thesis deals with information uncertainty, which is uncertainty about the true value of a unit of information. The intrinsic connection between uncertainty and information is the basis for our encapsulation approach, which underpins the automatic propagation and visualization-oriented uncertainty abstraction. However, there exist several other forms of uncertainty, such as uncertainty arising from interpretation, for which the encapsulation approach may not be suitable. The methods presented in this thesis are limited to those forms of uncertainty that can be parametrized in some quantifiable way.
    Modeling of uncertainty has its foundation in mathematics. This project is concerned with the frameworks, approaches, and methods for applying these modeling techniques. As such, mathematical issues will be touched on; however, detailed coverage of mathematical models is beyond the scope of this work and it is assumed that users will use the mathematical techniques appropriate to their problem.
    1.4 Original Contribution
    Current investigations into information uncertainty visualization have focused on visualization techniques for particular information uncertainty data types. We approach the problem of information uncertainty visualization holistically, from modeling and automated propagation through to user-objectives in visualization.
    We produce an integrated information uncertainty modeling and visualization framework and design the information uncertainty visualization spreadsheet, which intrinsically supports information uncertainty modeling, automated uncertainty propagation, and uncertainty model abstracted visualization.
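    As a rough illustration of the encapsulation and automated propagation ideas described above, the following minimal sketch stores a value together with its uncertainty as a closed interval and overloads arithmetic so that an ordinary formula propagates the bounds without any uncertainty-specific code. It is not the thesis prototype: the `Interval` class, its `lift` helper, and the `total_cost` formula are hypothetical names chosen for this example, and plain interval arithmetic is assumed as the propagation rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical cell value: information and its uncertainty as bounds [lo, hi]."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower bound exceeds upper bound")

    @staticmethod
    def lift(x):
        """Promote a plain number to a degenerate (certain) interval."""
        return x if isinstance(x, Interval) else Interval(float(x), float(x))

    def __add__(self, other):
        o = Interval.lift(other)
        return Interval(self.lo + o.lo, self.hi + o.hi)

    __radd__ = __add__

    def __sub__(self, other):
        o = Interval.lift(other)
        return Interval(self.lo - o.hi, self.hi - o.lo)

    def __rsub__(self, other):
        return Interval.lift(other) - self

    def __mul__(self, other):
        o = Interval.lift(other)
        # The extremes of a product lie among the four corner products.
        corners = (self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi)
        return Interval(min(corners), max(corners))

    __rmul__ = __mul__


def total_cost(unit_price, quantity, fixed_cost):
    """An ordinary formula; it contains no uncertainty-specific code."""
    return unit_price * quantity + fixed_cost


# The same formula works for certain and uncertain inputs.
certain = total_cost(5.0, 10, 20.0)
uncertain = total_cost(Interval(4.5, 5.5), 10, 20.0)
print(certain, uncertain)
```

    Because the formula itself never mentions intervals, the uncertainty on `unit_price` can be added, changed, or removed without editing `total_cost`, which is the kind of separation the aims above call for. Other modeling techniques (for example probabilistic or fuzzy values) would need their own propagation rules and are not covered by this sketch.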
  • 34. To achieve this we extend the spreadsheet paradigm to incorporate information uncertainty and visualization features. This requires a number of components. Firstly, our approach of encapsulating uncertainty information semantically links the information to its uncertainty. Secondly, we introduce the uncertainty propagation model to manage the mechanics of propagating uncertainty, including operations involving mixed data-type parameters. Thirdly, we present hierarchical heterogeneous propagation, which automatically determines suitable combinations of the available methods to ensure that the propagation can be achieved. Fourthly, we produce uncertainty abstraction models, which abstract the uncertainty information for visual mapping in visualizations by providing a common plural value type. Fifthly, we incorporate flexible visualization capabilities into the spreadsheet using a visualization sheet.
    Abstraction from the information uncertainty data type means that traditional data-type specific visual mapping criteria may no longer be applicable, leaving a gap in the knowledge. To address this, we investigate user-objectives for information uncertainty visualization, which describe the characteristics of uncertainty space that the user is seeking to visualize. User-objectives provide a data-type abstracted means of describing, executing, and evaluating visualizations.
    1.5 Significance
    The significance of this work is that it provides the means for an intuitive and non-intrusive environment for modeling and visualizing information uncertainty. This has three major effects. Firstly, access to information uncertainty visualization is designed into the system from the outset and it does not require user expertise in uncertainty techniques to manage information uncertainty. Secondly, uncertainty information is easily added, changed, or removed at any stage of the process. Thirdly, information uncertainty visualizations can be built independently of the modeling technique, providing a coherent foundation for the development of visualization techniques while reducing their tendency to be ad hoc.
  • 35. Information uncertainty is a problem in many fields. Overcoming barriers to its modeling and visualization is an important step in managing a difficult problem.
    1.6 Organization of the Thesis
    The organization of this thesis is as follows. Chapter 2 introduces background material on uncertainty modeling techniques, visualization techniques, and what has been done to visualize uncertainty. Chapter 3 describes the framework that integrates information uncertainty modeling and visualization tasks together into a coherent whole. Chapters 4 through 6 cover the components of the framework: Chapter 4 elaborates on the spreadsheet paradigm as a mechanism for integrating and managing these tasks. Chapter 5 investigates the encapsulation approach to information uncertainty, which includes the unified hierarchy and automated propagation. Chapter 6 explores the abstraction approach, which includes uncertainty abstraction models and user-objectives for visualization. Chapter 7 integrates the components into a core system, covering the requirements, design, and architecture. Chapter 8 considers advanced features and extensibility of the system. Chapter 9 presents the evaluations of the system, with a comparative analysis of different approaches, a discussion of a survey, and a case study in business planning. Chapter 10 provides a conclusion and points to future work.
  • 37. CHAPTER 2 Background
    “As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.” – Albert Einstein (in J. R. Newman (ed.), The World of Mathematics, New York: Simon and Schuster, 1956)
    2.1 Introduction
    Information uncertainty is a complex subject that is inherent in many real-world problems. The uncertainty comes from different sources and can be interpreted and modeled in various ways. There are often subtle interactions between variables and uncertainty, which can be difficult to understand. Visualization of information uncertainty presents an opportunity to provide deeper insights into the nature of the information, its uncertainty, and the impact it has on outcomes. However, the difficulty of adopting information uncertainty and the lack of visualization tool support has caused many practitioners to ignore the uncertainty completely or to ignore situations where the
  • 38. uncertainty is deemed too high. This practice results in valuable knowledge being discarded and reduced quality in outcomes, or worse, can even result in entirely wrong outcomes.
    There are aspects of information uncertainty that have been given considerable attention, particularly in the field of mathematics. Two aspects have been especially well developed: the first includes the various mathematical models that exist for representing, measuring, and recording uncertainty. The second aspect is the collection of rules and techniques for propagating, estimating, and minimizing information uncertainty. These models and techniques range from the statistical methods and probabilities through to fuzzy models.
    Research into visualization of information uncertainty has only been carried out sporadically during the last decade. Earlier work has focused on a data-driven approach, with visual data representations for particular data types or responding to the needs of specific applications. More recent work has investigated task-based approaches and sought to integrate higher-level issues, such as software architectures and frameworks for visualization systems.
    The aim of this chapter is to provide background for understanding information uncertainty modeling and visualization by examining relevant works and identifying key issues. The chapter is organized as follows. Section 2.2 describes information uncertainty in general, covering sources of information uncertainty, understanding information uncertainty and its usage, and information uncertainty modeling techniques. Section 2.3 discusses relevant issues in visualization, focusing on the process and sensemaking cycle, and visualization techniques. Section 2.4 examines current progress and key techniques in information uncertainty visualization. A summary of this chapter is given in Section 2.5.
    2.2 Information Uncertainty
    In many circumstances the true value of a variable is not fully known, giving rise to information uncertainty. The information that is known about the variable can be
  • 39. 2.2 Information Uncertainty 11 stored and this technique is referred to as information uncertainty modeling. As an example, information uncertainty modeling can be used to aid analysis of the potential environmental impact of a new road. Data is required about the type, amount, and distribution of vegetation; the variety, location, and habits of local animals; and how all of these interact. The data that is collected will only be accurate to a certain level of precision, which can be modeled. Further, much of the information derived from expert knowledge will be qualitative in nature and thus dependent on interpretation. It is already a significant task to understand the structure, characteristics, trends, and interdependency of data. However, information uncertainty serves to complicate things even further as it requires an understanding of the propagation of uncertainty, the potential for variation in outcomes, and impacts due to changes in the level of information uncertainty. Effective visualization of the information and its uncertainty can help to overcome this problem. Historically, uncertainty had been regarded as an undesirable factor that is to be avoided. Only in the 20th century has it become a fundamental component of sci- ence [62]. However, the term uncertainty itself can vary depending on the author and the field. For example, Hunter and Goodchild, dealing with spatial databases, reserve the term uncertainty to refer exclusively to unknown inaccuracy and instead use the term error for objectively known inaccuracy [47]. Pang et al. use the term uncer- tainty to cover three categories [86]: statistical, including probabilistic and confidence methods; error, which refers to differences between estimates and actual values; and range, which covers intervals of possible values. 
Klir [58, 60] and Gershon [33] offer a more general definition of uncertainty as some deficiency in information, and from there define a measure of information in terms of reduction in uncertainty.

Standards and guidelines have been developed for the management of uncertainty in measurement. One such guide by the National Institute of Standards and Technology (NIST) describes measurements as approximations and contends that “the result is complete only when accompanied by a quantitative statement of its uncertainty” [110, p. 1]. A
similar guide that was issued for analytical chemistry by Eurachem defines measurement uncertainty as a parameter “that characterizes the dispersion of the values that could reasonably be attributed to the measurand” [29, p. 4]. The common theme is that uncertainty can be characterized for a particular unit of information, and we use the term information uncertainty to refer to situations where this condition holds.

2.2.1 Sources of Information Uncertainty

Pang et al. [86] investigated uncertainty visualization and categorized sources of information uncertainty based on the point of the visualization process at which it is introduced. The resulting three categories are acquisition, where information uncertainty is introduced from the measurements and models; transformation, introduced during the information processing step for visualization; and visualization, referring to the uncertainty introduced through the act of the visualization itself. These categories are helpful in characterizing the introduced uncertainty for visualization, but lack granularity in describing the reason for the uncertainty. Thomson et al. [112] focused on the tasks of information analysts in the field and used their descriptive terms to derive a categorization for uncertainty in geospatially referenced information.

Information uncertainty can arise for a number of reasons. Whenever predictions are made, they are uncertain. Errors and imprecision in measurement are another common source. The Eurachem guide lists eleven sources of measurement uncertainty [29], but is careful to point out that these may not necessarily be independent. While their list includes “operator effects” to cover human-introduced uncertainty, the sources are mostly concerned with acts of measurement. Pham and Brown [90] provide a categorization of uncertainty into three categories: factual, pseudo-measurement and pseudo-numerical, and perceptual-based.
Factual information is numerical and measurement-based. Pseudo-measurement and pseudo-numerical information are numeric approximations. Perceptual-based information is typically linguistic, but can also be image- or sound-based. Table 2.1 lists typical sources of information uncertainty and examples of causes (from [91]). Earlier work by Reznik and Pham
matched nine similar categories of uncertainty sources to uncertainty modeling techniques [100].

| Sources of information uncertainty | Causes |
|---|---|
| Limited accuracy | Limitation in measuring instruments, or computational processes, or standards. |
| Missing data | Physical limitation of experiments; limited sample size or non-representative sample. |
| Incomplete definition | Impossibility or difficulty in articulating exact functional relationships or rules. |
| Inconsistency | Conflicts arising from multiple sources or models. |
| Imperfect realisation of a definition | Physical or conceptual limitation. |
| Inadequate knowledge about the effects of the change in environment | Model does not cover all influence factors; or is made under slightly different conditions; or is based on the views of different experts. |
| Personal bias | Differences in individual perception. |
| Ambiguity in linguistic descriptions | A word may have many meanings; or a state may be described by many words. |
| Approximation or assumptions embedded in model design methods or procedures | Requirements or limitations of models or methods. |

Table 2.1: Sources and Causes of Information Uncertainty

2.2.2 Understanding Information Uncertainty

The search for truth is a goal of science and the presence of uncertainty can imply a deficiency in our understanding. This explains why, throughout most of recorded history, scientific thought has sought to avoid uncertainty (the interested reader is directed to Appendix A of [2] for a history of perspectives on knowledge). However, attitudes toward uncertainty have begun to shift, partly due to discoveries such as the Heisenberg uncertainty principle. Today, uncertainty is viewed as an intrinsic property of problems in most fields. For example, Couclelis noted that considerable effort had been devoted to fighting uncertainty in Geographic Information Systems (GIS), but that there are
many things that cannot be known, and the inability to know was not due to human limitation [19].

Many disciplines use information uncertainty modeling techniques to manage uncertainty. The incorporation of information uncertainty techniques enables practitioners to describe and quantify the uncertainty space. Uncertainty information at the inputs can then be propagated through the model to the outputs. The output can then provide additional information: for example, how much confidence we should place in the result, what alternatives the result may have, and more, depending on the modeling technique and the inputs.

A recent use for information uncertainty techniques is to simplify systems by removing less important information. This mirrors human reasoning, where we reserve detail for items of interest. For example, when ascertaining whether to jump out of the way of a moving vehicle, a rough estimate of the vehicle’s velocity is usually sufficient to determine the appropriate action [77]; precise knowledge of the actual velocity is usually not necessary.

There are two main approaches to information uncertainty modeling and propagation. The first approach is to use analytical techniques, which require an understanding of the mathematical principles involved. The second approach is to use numerical techniques, such as Monte-Carlo simulation. Sometimes numerical techniques are used implicitly without the user realizing, usually by manually varying the inputs and observing their effects.

Uncertainty is so intrinsic to information that Klir [62, 63, 58, 61, 59, 60] has been working on a generalized information theory, which aims to incorporate uncertainty and information into a unified theory. The approach is to conceive of uncertainty-based information as being the result of a reduction in uncertainty.
As a result of some action, the a priori uncertainty $U_1$ becomes the a posteriori uncertainty $U_2$ and the information derived from this action is therefore given by $U_1 - U_2$ [59].

Using information uncertainty modeling techniques not only provides greater confidence in results, but can also give an indication of how much confidence to place in
the result.

2.2.3 Approaches to Modeling Information Uncertainty

There are numerous uncertainty modeling techniques and we will describe several of the major ones here. Since the sources and causes of uncertainty differ, various mathematical models have been developed to faithfully represent different types of information. We summarize these models into four common types:

Probability denotes the likelihood that an event occurs or that a match is positive. Probability theory provides the foundation for statistical inference (e.g. Bayesian methods) [1, 5].

Possibility provides alternative matches, e.g. a range of errors in measurement [2].

Provability is a measure of the ability to express a situation where the probability of a positive match is exactly one. Provability is the central theme of techniques such as Dempster-Shafer calculus.

Membership denotes the degree of match and allows for partially positive matches, e.g. fuzzy sets [3, 6], rough sets [4].

Probability theory models uncertainty in terms of anticipation: the expectation that an outcome will eventuate is characterized by a probability. Classical probability theory describes the ratio of favorable outcomes to all possible outcomes, each treated as equally likely, which has several shortcomings. This led to the development of frequentist probability theory, which defines the chance of a given result under random conditions. Thus, in a repeated experiment the probability of an event will tend toward the ratio between the number of times it occurs and the number of times the experiment was run:

$$\Pr(x) = \frac{x}{x + \bar{x}}$$

where $\Pr : X \to [0,1]$ is the probability function, $x \in X$ is the event, and $\bar{x} = X - \{x\}$ is all other outcomes. A probability distribution completely describes the expected
outcomes of a random variable. For real-valued random variables, the probability distribution can be defined by

$$F(x) = \sum_{x_i \le x} \Pr(x_i)$$

for discrete probabilities and

$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

for continuous probabilities, where $f$ is the probability density function. A probability density function (PDF) is effectively a histogram of expected outcomes, scaled such that its integral is unity, $\int_{-\infty}^{\infty} f(t)\,dt = 1$.

Probability distributions can take any form; however, several well-studied distributions exist. Of these, the two most commonly used are the uniform distribution and the normal distribution. The uniform distribution assigns every outcome an equal probability. The normal distribution (also known as the Gaussian distribution, after Gauss) has a PDF of

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation. Normal distributions find common use because the sum of many independent random variables will approximate a normal distribution.

Monte-Carlo simulation is a numerical approach to uncertainty that uses probability distributions [76, 3]. Input variables are assigned a probability distribution, commonly a uniform or normal distribution. Numerous random instances are chosen for the inputs according to these distributions, and the outputs calculated from them are then used to characterize the output of the system.

The frequentist view of probability models an expectation based purely on frequency of events. An alternative view is Bayesian probability theory, which has found widespread use in fields such as machine learning and computer vision [89], and
econometrics [35]. The mathematician Thomas Bayes introduced a theorem that was generalized by Laplace but is still referred to as Bayes’ Theorem. The theorem relates the conditional and marginal probabilities of events of two random variables, x and y:

$$\Pr(x \mid y) = \frac{\Pr(y \mid x)\,\Pr(x)}{\Pr(y)}$$

Bayes’ theorem enabled a new philosophical view of probabilities as modeling belief, using what is called Bayesian inference. Thus, our expectation of an event can be revised: $\Pr(x \mid y)$ is the posterior probability, our revised expectation of event x given evidence y; $\Pr(y \mid x)$ is the conditional probability of seeing y given the hypothesis that x is true; $\Pr(x)$ is the prior probability of x; and $\Pr(y)$ is the marginal probability of y, whether or not x is true.

Probabilities, whether frequentist or Bayesian, are not the only means of describing uncertainty. Classical sets are another means of defining uncertainty. For example, the diagnosis made by a medical doctor could be that a patient suffers from the flu or the cold. In this situation there is a possibility that either (or both) is true and there is uncertainty about which it is. Lotfi Zadeh, in his seminal 1965 paper, proposed fuzzy sets, which along with fuzzy logic enable human-like reasoning using partial truth⁴. A fuzzy set is a set where each member can be assigned a truth value. This can be expressed as a fuzzy membership function $\mu : X \to [0,1]$, where 0 indicates not a member, 1 indicates definitely a member, and all numbers in between indicate a partial membership.

A good example of fuzzy sets is given by Mendel in [77]. College students were asked to rank words such as “somewhat” and “quite a bit” against a numerical scale of quantity. Although individual answers varied significantly, a clear ordering emerged and Mendel was able to produce a mapping between these words and their indication of quantity.
From there fuzzy sets can be constructed, such as the set lots, which capture the degree to which numeric values can be representations for each of the words.

⁴ Zadeh was not the first to investigate partial truth and interested readers are directed to the works of Łukasiewicz [71].
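Returning briefly to the Monte-Carlo procedure described earlier in this section, the steps (assign distributions to the inputs, draw random instances, evaluate the model, characterize the outputs) can be sketched as follows. This is a minimal illustration rather than a method taken from the thesis: the function `monte_carlo`, the `area` model, and the chosen input distributions are all hypothetical, and only the Python standard library is assumed.

```python
import random
import statistics


def monte_carlo(model, samplers, n_samples=10_000, seed=0):
    """Propagate input uncertainty through `model` by repeated random sampling.

    `samplers` maps each input name to a function that draws one random
    instance of that input from its assigned distribution.
    """
    rng = random.Random(seed)  # seeded so that the run is repeatable
    outputs = []
    for _ in range(n_samples):
        inputs = {name: draw(rng) for name, draw in samplers.items()}
        outputs.append(model(**inputs))
    return outputs


# Hypothetical model: the area of a plot whose sides are measured imprecisely.
def area(width, height):
    return width * height


samplers = {
    "width": lambda rng: rng.gauss(10.0, 0.1),    # normal: mean 10, std dev 0.1
    "height": lambda rng: rng.uniform(4.9, 5.1),  # uniform on [4.9, 5.1]
}

outputs = monte_carlo(area, samplers)
mean_area = statistics.fmean(outputs)  # central estimate of the output
sd_area = statistics.stdev(outputs)    # spread, i.e. the propagated uncertainty
print(f"area is approximately {mean_area:.2f} with std dev {sd_area:.2f}")
```

The list of sampled outputs could equally be summarized as a histogram or as percentiles; the mean and standard deviation are used here only because they are the simplest summary.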
For example, assuming that 24°C is normal room temperature and 30°C is completely hot, temperatures between 24°C and 30°C are partially hot. The set of hot temperatures might therefore be given by the following membership function, illustrated graphically in Figure 2.1:

$$\mu_{Hot}(x) = \begin{cases} 1 & x > 30 \\ \dfrac{x - 24}{30 - 24} & 24 \le x \le 30 \\ 0 & x < 24 \end{cases}$$

Figure 2.1: Fuzzy Set for Hot (membership, from 0 to 1, plotted against temperature, from 20 to 32)

Fuzzy logic is the inference counterpart for fuzzy sets. A complete fuzzy logic system consists of a fuzzifier, inference engine, and defuzzifier [77]. The fuzzifier maps numeric values to partial memberships of fuzzy sets. The inference engine executes operations and rules on fuzzified information. The defuzzifier converts a fuzzy representation into a bi-valued representation, typically using an α-cut.

The graph in Figure 2.2 shows an example fuzzifier that maps room temperatures to the fuzzy sets Cold, Normal, and Hot. A temperature of 25.5°C will be wholly outside the set Cold, mostly within the set Normal, and to a lesser extent partially within the set Hot.

Methods are defined for several operations including fuzzy AND (intersection), OR (union), and NOT (complement); implementations of these operators can vary depending on the application, and the interested reader is directed to [77]. Traditional fuzzy logic, also called Zadeh fuzzy logic after its inventor, is shown graphically in Figure 2.3. Given two fuzzy variables,
a and b:

$$a \cup b = a \;\mathrm{OR}\; b = \max(\mu(a), \mu(b))$$

$$a \cap b = a \;\mathrm{AND}\; b = \min(\mu(a), \mu(b))$$

$$\neg a = \mathrm{NOT}\; a = 1 - \mu(a)$$

Figure 2.2: Fuzzification for Temperature (sets Cold, Normal, and Hot; membership, from 0 to 1, plotted against temperature, from 16 to 32)

Figure 2.3: Results of Fuzzy Operations are Shown by the Grey Shaded Regions (panels: Set A, Set B, A AND B, A OR B, NOT A)

Thus rules can be established using constructs such as $D = A \wedge B \vee \neg C$, where A, B, C, and D are fuzzy sets.
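The pieces described so far (a membership function acting as fuzzifier, the Zadeh operators, a rule, and an α-cut defuzzifier) can be put together in a short sketch. It uses the standard Zadeh definitions, namely maximum for OR, minimum for AND, and one minus the membership for NOT. The function names, the membership values for B and C, and the reading of the rule as (A AND B) OR (NOT C) are assumptions made for illustration only.

```python
def mu_hot(x):
    """Membership of temperature x (degrees C) in the fuzzy set Hot."""
    if x > 30:
        return 1.0
    if x < 24:
        return 0.0
    return (x - 24) / (30 - 24)  # linear ramp between 24 and 30


def fuzzy_or(a, b):
    """Zadeh union: the larger of the two memberships."""
    return max(a, b)


def fuzzy_and(a, b):
    """Zadeh intersection: the smaller of the two memberships."""
    return min(a, b)


def fuzzy_not(a):
    """Zadeh complement."""
    return 1.0 - a


def alpha_cut(membership, alpha=0.5):
    """Defuzzify to bi-valued truth: 1 if membership >= alpha, else 0."""
    return 1 if membership >= alpha else 0


# Fuzzify: A is "the room is hot" at 27 degrees C.
a = mu_hot(27)
# B and C are hypothetical memberships supplied directly for the example.
b = 0.8
c = 0.7

# Inference: D = A AND B OR NOT C, with AND assumed to bind tighter than OR.
d = fuzzy_or(fuzzy_and(a, b), fuzzy_not(c))

# Defuzzify with an alpha-cut of 0.5.
decision = alpha_cut(d, alpha=0.5)
print(a, d, decision)
```

Other operator families (for example product-based AND) would change only the three operator functions; the fuzzify, infer, defuzzify structure stays the same.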
The defuzzifier typically uses an α-cut, which is a mechanism to translate the fuzzy output into traditional bi-valued truth, most typically:

$$\mu'(d) = \begin{cases} 1 & \mu(d) \ge \alpha \\ 0 & \text{otherwise} \end{cases}$$

where $\mu(d) \in [0,1]$ and $\alpha$, called the “α-cut plane”, is in the range [0,1]. An example of defuzzification with an α-cut of 0.5 is given graphically in Figure 2.4. As the graph shows, the set of normal temperatures is mapped to the interval [21.5, 26.5].

Figure 2.4: Defuzzification Using an α-cut (set Normal, with an alpha cut of 0.5; temperature in °C, from 16 to 32)

A popular representation for uncertainty is the rough set [95]. Rough sets extend classical sets to allow an element to be both inside and outside the set. Thus there are three modes: inside, outside, and both. Three operations are defined that translate to a classical set: the upper limit, the lower limit, and the boundary. The upper limit includes all items that are wholly inside, or both inside and out. The lower limit includes only those items that are inside the set. The boundary includes only items that are both inside and outside the set. This is illustrated graphically in Figure 2.5. An example application of rough sets is in classifying customer details: the rough set information provided will contain a customer if all information has been provided, not contain a customer if no information is provided, and be in both states if some information is provided but some is missing. The company can send letters requesting information to
¬LOWER(c), where c is the “information provided” rough set.

Figure 2.5: Example Rough Set for Containment of a Region (regions labeled Boundary and Lower)

Another common classical set-based uncertainty modeling technique is the interval. Intervals define the upper and lower boundaries on a continuum, most commonly $\mathbb{R}$. The boundaries themselves can be inclusive, which is indicated using square brackets; or exclusive, indicated using round brackets. Thus, [0,1) includes zero but excludes one. Interval arithmetic defines the propagation of uncertainty under common arithmetic operators. For example, addition is defined as:

$$[a,b] + [c,d] = [a+c,\; b+d]$$

2.3 Visualization

2.3.1 The Sensemaking Process

The user has a visualization objective. To achieve this objective they will use a visualization technique. The technique transforms information and displays it according to the parameters of the technique. To reach their objective, users will typically iteratively adjust the parameters and view the results, repeating as often as necessary. There are three general classes of user objectives, which can also be described as visualization phases [56]:
1. Exploration, searching the data for relations and patterns
2. Analysis, exploring known relations
3. Presentation, preparing the visualization to communicate information to others

Visualization requires an iteration of choose-inspect-view-adjust, which can be cumbersome, particularly for novice visualization users. Several studies have therefore considered improving the user experience while going through the visualization process [21, 49, 92, 11].

One study sought to encode the visualization exploration process using an XML-based language [49]. The encoding captures the parameters used for each iteration of the loop. By defining a parameter derivation calculus, the results of several visualization sessions can then be visualized. Such visualizations of visualization sessions are designed to aid the user in understanding the progression of their use of the system. Although the work seeks to formalize the visualization process, it does not improve the process and is limited to modulating parameters of a particular visualization technique. A significant drawback of the work in [49] is that the model does not include the ability to change to another type of representation or to select alternate data.

Visualization is a tool and not an objective in and of itself. However, it has been observed (e.g. [73]) that some visualizations are good for publications, tending to be colorful and showy images, but not informative or applicable to real-world problem-solving. Ma [73] argues that scientists need to be involved in evaluating the effectiveness of visualization methods, and suggests working with users from application domains both to devise the requirements of the visualization and to subsequently evaluate the techniques through case studies.

Another suggestion for overcoming these obstacles is for visualization to be task-driven instead of data-driven [92, 11].
One approach to this is through an agent-based framework [92] (see Figure 2.6), where a profile agent observes the user’s choice of visualizations and adjusts the system’s behavior to improve workflow.
Figure 2.6: The Multi-agent Visualization Support System

Another proposal is the “Visualization Task Network (VTN)” [11] (see Figure 2.7, from [11, p. 603]), which can learn the requirements of the user. A VTN is a task-oriented approach, where the user first selects the task to be achieved. For each chosen task a set of techniques is proposed by the system. Once a technique is chosen, a list of attributes (e.g. glyph information, grid spacing, and color) is presented. These parameters are similar to those in the work of Jankun-Kelly et al. [49]. Each time the user selects a {task, technique, attribute} set for visualization, the system can increase the weight of that combination. When the user selects a task, the techniques with the highest weighting are shown first. Similarly, once a task and technique are chosen, the attributes with the highest weighting are shown first.

One approach to mapping visual features to visualization techniques takes an objective-oriented viewpoint, and is derived from the visualization data ontology outlined in [11]. The mapping begins with the choice of data attributes to be represented: relationships, resemblances, order and proportion [34]. These attributes are then mapped to visual features depending on the visualization task to be performed. The visualization task is chosen to enhance the perception of the information required by the viewer for their specific objective in performing the data analysis. The knowledge required for such a task-oriented approach is encapsulated in an agent-based visualization architecture [11].

A workshop [27] established the visualization ontology represented in Figure 2.8 (adapted from [27]). Development of the ontology involved investigation
of visualization from multiple perspectives. The result produces a clear anatomy of visualization, except that it is missing one vital part: the role of the user. The user plays an integral role in the visualization process, driving parameters and the visualization tasks. By excluding the user from the ontology the authors have neglected to consider not only usability and cultural issues, but also opportunities such as adaptive visualization systems.

Figure 2.7: The Visualization Task Network (VTN) Learns Task-oriented Visualization Parameters

Figure 2.8: An Ontology of Visualization

2.3.2 Visualization Techniques

The topic of visualization is traditionally introduced through reference to a taxonomy of visualization techniques [41, 98, 15, 16]. SIGGRAPH’s visualization education
program [41] introduces visualization techniques through a data-type based classification, which is reproduced in Figure 2.9. Examples of selected techniques are given in Figure 2.10. The limitation of this classification is that it deals only with continuous ordinal values. Visualizations for other types of data, such as trees, are not included.

Figure 2.9: Visualization Techniques Categorized by the Type of Data to be Visualized

Shneiderman [106] recognized the lack of trees and network graphs and addressed this by including non-ordinal types in the taxonomy. However, the taxonomy itself continues to be based on the data-type being visualized. The data-types identified are [106, pp. 337-339]:
• 1-Dimensional, such as textual documents, program source code, and alphabetical lists of names.
• 2-Dimensional, such as geographic maps, floor plans, and newspaper layouts.
• 3-Dimensional, real world objects such as molecules, the human body, and buildings.
• Temporal, such as medical records, project management, or historical presentations, for which time lines are important enough to create a data-type that is separate from 1-dimensional data.
• Multi-dimensional, such as records in relational databases.
• Trees, which are hierarchies where each item, except the root item, has a link to its parent.
• Networks, which represent relations that cannot be captured as trees.

Figure 2.10: Examples of Selected Visualization Techniques: (a) 2D Line based Contouring, (b) 2D Histogram, (c) 3D Streamlines

OLIVE [98] is an online catalog of visualization systems categorized according to this taxonomy, although at the time of writing it is only current up to 1997. While Shneiderman’s taxonomy covers a wider range of visualizations, not all visualization systems fit conveniently. For example, visualizations that present temporally ordered 3-Dimensional data could fit into either the temporal or the 3-Dimensional category. To overcome these inconsistencies Card and Mackinlay [15] offer a classification based on additional factors that need to be considered during visualization. Their analysis of
visualization systems considers not only the type of data, but also the filtering functions applied to them, the controlled (text) and automatic (glyph) processing techniques, the viewing transformations, and the user interaction elements for every variable in the visualization. Data types are classified as [15, pp. 92-93]:

• Nominal, meaning they are only equal or unequal to other values.
• Ordinal, meaning they obey a less-than relation.
• Quantitative, meaning it is possible to do arithmetic on them.
• Intrinsically spatial, which are the subset of quantitative types that represent spatial points.
• Geographical, which are the subset of intrinsically spatial types that represent geographic locations.
• A Set mapped to itself, which is the case in graphs and trees.

This taxonomy is cumbersome for the purpose of categorizing visualization techniques. Unlike the preceding taxonomies, there is no single category for a visualization technique. Instead, each variable used in the visualization is decomposed according to twelve factors and presented in a matrix. The matrices of two techniques can be compared to pinpoint the exact differences between them. The intent of the authors is not only to describe the differences in visualization techniques, but also to suggest new possibilities for visualization techniques [15, p. 92].

Chi [16] provides a taxonomy to help implementers understand how to implement visualization techniques. The proposed taxonomy is based on their earlier work on the Data State Reference Model [18]. A visualization technique is broken down into four stages according to the state of the data, as shown in Table 2.2, and the data transformation operators that transform the data from one stage to another, as listed in Table 2.3.
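To make the staged structure concrete, the sketch below chains three transformation operators so that data moves through the four stages named in Tables 2.2 and 2.3: value, analytical abstraction, visualization abstraction, and view. The concrete operations (word counting, selecting the most frequent words, drawing a text bar chart) are hypothetical stand-ins chosen for brevity and are not taken from Chi's work.

```python
from collections import Counter


def data_transformation(value):
    """Value -> analytical abstraction (here: word frequencies from raw text)."""
    return Counter(value.lower().split())


def visualization_transformation(abstraction, top_n=3):
    """Analytical abstraction -> visualization abstraction.

    Reduces the abstraction to visualizable content, here the top_n
    (label, count) pairs.
    """
    return abstraction.most_common(top_n)


def visual_mapping_transformation(vis_abstraction):
    """Visualization abstraction -> view (here: a plain-text bar chart)."""
    return "\n".join(
        f"{label:<6}{'#' * count}" for label, count in vis_abstraction
    )


# Stage 1: the value (raw data).
value = "to be or not to be that is the question to be"

# Apply the three operators in sequence to reach stage 4: the view.
view = visual_mapping_transformation(
    visualization_transformation(data_transformation(value))
)
print(view)
```

Keeping each operator as a separate function mirrors the model's intent: a different view of the same analytical abstraction only requires swapping the last two steps.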
| Stage | Description |
|---|---|
| Value | The raw data. |
| Analytical Abstraction | Data about data, or information, a.k.a. meta-data. |
| Visualization Abstraction | Information that is visualizable on the screen using a visualization technique. |
| View | The end-product of the visualization mapping, where the user sees and interprets the picture presented to her. |

Table 2.2: Data Stages in the Data State Model

| Processing Step | Description |
|---|---|
| Data Transformation | Generates some form of analytical abstraction from the value (usually by extraction). |
| Visualization Transformation | Takes an analytical abstraction and further reduces it into some form of visualization abstraction, which is visualizable content. |
| Visual Mapping Transformation | Takes information that is in a visualizable format and presents a graphical view. |

Table 2.3: Transformation Operators in the Data State Model

Tory and Möller [113] argue that these taxonomies are vague because of the terminology used. As an example they cite the use of the word “often” in Card and Mackinlay’s definition [113, p. 1]. To reduce this ambiguity, their taxonomy is based on the data model rather than the type of data itself. A data model is a representation of data that may include structure, attributes, relationships, and the data values themselves. Visualization algorithms create visual representations of data using a data model. The taxonomy is outlined in Figure 2.11. The visualization algorithms are first classified as continuous or discrete. Scientific visualization corresponds largely to continuous models while information visualization corresponds largely to discrete models.

Unlike Card and Mackinlay’s taxonomy, the model-based taxonomy maintains scalar, vector, and tensor categories for dependent variables. Additionally the taxonomy shows greater flexibility than Shneiderman’s taxonomy, categorizing temporally ordered 3D data as nD data in the continuous model.
A limitation of the taxonomy is that it does not treat temporal data as distinct from 1D data.
Figure 2.11: The Model-based Visualization Taxonomy

2.4 Information Uncertainty Visualization Approaches

Johnson and Sanderson argue that “development of formal theoretical frameworks and the new visual representations of error and uncertainty will be fundamental to a better understanding of 3D experimental and simulation data” [51, p. 5]. This is a relatively new field in visualization, which is generally referred to as uncertainty visualization. However, uncertainty and its sources are diverse and the term can have broad meaning. On the other hand, error visualization (e.g. [51, 83]) and fuzzy visualization (e.g. [93, 39, 5]) imply particular uncertainty modeling techniques. In this thesis we use the term information uncertainty visualization to refer to visualization of all modeling techniques where the uncertainty can be codified in information. Thus error visualization and fuzzy visualization are sub-categories of information uncertainty visualization, which is itself a sub-category of uncertainty visualization. This relationship is shown in Figure 2.12.

Visualization techniques map data variables and information to visual feature dimensions for the purpose of highlighting trends, making comparisons, establishing outliers, examining data composition, and similar reasons. The introduction of uncertainty requires that appropriate visual features be selected to represent it. Blurring simulates the visual percept caused by an incorrectly focused visual system and therefore has the most immediately intuitive mapping for uncertainty [91]. Blurring effectively
smears the boundary of the graphic representing the data value, creating a sense of uncertainty as to where it begins and ends. A number of visual features may be used in a similar manner, including hue, luminance, and saturation, and these can be extended into the temporal domain through animation [67, 34, 10].

Figure 2.12: Relationship between Uncertainty Visualization, Information Uncertainty Visualization, Error Visualization, and Fuzzy Visualization

Brown [10, p. 84] offers a summary of available features drawn from the literature (e.g. [44, 33, 90]):

• Intrinsic representations - position, size, brightness, texture, color, orientation, and shape;
• Further related representations - boundary (thickness, texture and color), blur, transparency and extra dimensionality;
• Extrinsic representations - dials, thermometers, arrows, bars, different shapes, and complex objects such as pie charts, graphs, bars, or complex error bars.

2.4.1 Low-level Features

We now consider how several low-level features can be used to indicate uncertainty within information. The features to be considered are: hue and luminance, opacity, depth, texture, particles, glyphs, and sonification.
Hue and Luminance are commonly used to highlight data that is different, or to represent gradients in the data [115, 56]. Saturation of the hue can be used to highlight the precision or certainty of the data. The more saturated the hue, the more certain or crisp the value contained in that region is, while low saturation regions have the appearance of washing into each other, and can be used to indicate the fuzziness of spatial region boundaries [50, 42]. Variation in hue can also be used to indicate precision. Regions of higher uncertainty can have fewer shades, while more precise areas have a smoother appearance. A lack of background/foreground separation (e.g. red on purple) can also imply uncertainty, as the region may only just be distinguishable [122]. Brown and Pham [93] used color hues to represent the membership values of data points. Color hues were also used by Lowe et al. [70] to represent belief values in the form of a flame to facilitate decision making in an anaesthetic monitoring system.

Opacity offers an intuitive method for creating blurriness. The more uncertain regions can be shown with reduced opacity, creating a ghost-like effect. The inverse approach, used by Djurcilov et al. [24, 25], is to map regions of high uncertainty to high opacity, thus drawing attention to the uncertain areas in volume visualization (see Figure 2.13). Johnson and Sanderson [51] show an example of a Magnetic Resonance Imaging (MRI) scan with an added error volume. The error volume represents the space of possible variation and is transparent so that the other data is still visible.

Depth can be used to indicate an order or spatial positioning for the data. Pang et al. [86] and Brown [10] displayed intentionally different images to each eye, exploiting a lack of binocular fusion to indicate fuzziness.
Blurring or depth of field effects, produced by removing spatial frequency components in the image plane, can be used to show the indistinct nature of data points [34, 64].
    Texture may be applied to objects to indicate the level of precision, ambiguity or fuzziness of a spatial location upon an object. Pang
  • 60. and Alper [84] used random normal perturbation to create a textured surface. The effect was proportional to the amount of uncertainty, creating rough regions where the uncertainty is high. Certain shimmering effects, usually to be avoided in visualization [115], can be used to indicate ambiguity within the region [111].
    Figure 2.13: Using Opacity to Show the Structure of Uncertainty. Color Scheme (left), Normal Rendering (center), Uncertainty Structure (right)
    Particles can be used to represent the uncertainty of a region or object by varying their density, opacity, and color. Grigoryan and Rheingans [37, 38] use particle density to indicate uncertainty. These particle clouds create a similar effect to transparent volumes. Cartography often also uses a form of this by drawing dashed lines to represent imprecise lines and boundaries, or by using different dot densities to represent shading effects [36].
    Glyphs are the most widespread method for displaying uncertainty. The size of a glyph is often used to indicate a scalar measure of uncertainty. For example, error bars are a traditional technique for indicating errors in measurement [115]. The larger the error bar, the more uncertainty there is. This concept was expanded upon by Pang and Freeman [85], who used the size of spherical and ellipsoidal glyphs to indicate uncertainty in radiosity applications. Lodha et al. [67] investigated uncertainty glyphs for flow visualizations, also using length to indicate degree of disagreement. In separate work [68] they used glyphs to show variation between surface interpolants, finding them to be more precise than using other
  • 61. features. Wittenbrink et al. [123] mapped variation in vectors to glyph length and width, to show uncertainty in magnitude and direction. In the same work they explored glyphs in keyframed animation to expose differences between interpolation techniques.
    Sonification is an approach that was explored early on. There are two main methods: one is to map the uncertainty directly to the pitch or volume, while the second uses the degree of uncertainty to regulate a noise generator. Fisher [31] allowed the user to scan a cursor over a landscape while the program emitted sound depending on the degree of uncertainty. Lodha et al. [66] went further by allowing multiple sound variables to be mapped simultaneously, thus increasing the amount of information conveyed.
    These low-level features offer an added dimension to which we can map uncertainty information for a particular plot point. Zhou and Pang [124] looked at several examples to visualize the level of error between original and reduced resolution meshes in a multi-resolution mesh algorithm (see Figure 2.14). We now consider how these are used in higher level constructions and methods that require multiple data points. In our discussion we include different spatial arrangements, use of image based techniques, addition and modification of geometry, and the use of animation.
    Figure 2.14: Visual Mappings Showing Difference. From Left to Right: Overlay, Rainbow Mapping, White-black-white Pseudo-coloring, Glyph (High-pass), Glyph (Low-pass)
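    The saturation and opacity mappings described in Section 2.4.1 can be illustrated with a short sketch. The code below is only an illustration under stated assumptions and is not taken from any of the cited systems: the function name `uncertainty_to_rgba`, the normalization of uncertainty to [0, 1], and the two mode names are all hypothetical. In the "fade" mode, uncertain values lose saturation and opacity (the ghost-like effect); in the "highlight" mode, opacity increases with uncertainty, in the spirit of the inverse approach attributed to Djurcilov et al.

```python
import colorsys


def uncertainty_to_rgba(hue, uncertainty, mode="fade"):
    """Map a hue and an uncertainty level to an (r, g, b, a) tuple.

    hue         -- hue in [0, 1], carrying the data value
    uncertainty -- assumed to be normalized to [0, 1]; values outside are clamped
    mode        -- "fade": uncertain values become more transparent
                   "highlight": uncertain values become more opaque
    """
    u = min(max(uncertainty, 0.0), 1.0)
    # Saturation falls as uncertainty rises, so uncertain regions wash out.
    saturation = 1.0 - u
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    if mode == "fade":
        alpha = 1.0 - u
    elif mode == "highlight":
        alpha = u
    else:
        raise ValueError("mode must be 'fade' or 'highlight'")
    return (r, g, b, alpha)


if __name__ == "__main__":
    for u in (0.0, 0.5, 1.0):
        print(u, uncertainty_to_rgba(0.0, u))
```

    Keeping the data value on the hue channel and the uncertainty on saturation and opacity means that the two can be read separately, which is the property the low-level features above rely on.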
  • 62. 2.4.2 Higher-level Constructions
    Uncertainty can be represented in several ways using 2D Cartesian graphs. Some examples of graphs include histograms, bar charts, tree diagrams, time histories of 1D slices, maps, iconic and glyph-based diagrams. For example, graphs are often used to represent the fuzzy membership functions (e.g. Figures 2.1-2.4) or probability density functions. The structure and inter-relationships of rules can be illustrated using graphs, trees and flowcharts.
    Fuzzy rules involving two inputs can be graphed in three dimensions. Figure 2.15 shows an example from the Matlab Fuzzy Toolbox [75], where the output shows the amount of tip, as determined by the quality of the food and service. Nürnberger explored drawing such classifiers as overlapping pyramid shapes. 2D classifiers are visualized as contours for a top-down view [80], whereas 3D classifiers are 3D shapes. An extension to this work discusses the effects that antecedent pruning has on the shapes [81]. Pruning of antecedents involves removal of restrictive rules and simplification of existing rules with the aim of improving the ability of the classification system to generalize to previously unseen input data. The authors argue that rule simplifications can have a dramatic impact on results and that visualization of these changes can provide an intuitive aid for fuzzy classifier designers.
    While the technique produces an intuitive aid, the authors have not gone far enough. Since the classifier is visualized as a shape that occupies the same space as the data, it suggests that it can be visualized together with the data. This would allow the user to observe how the data points of a particular data set classify, particularly when combined with animation or interactive techniques. Possible extensions include using size, color, and translucency to enhance perception of the classification given to a data point. Cox et al. 
[20] applied thresholds to produce convex hull plots of data point clusters, using glyphs of different shapes and sizes for the data points. A limitation of these techniques is that they are not well suited to multi-dimensional data. Techniques such as multi-dimensional scaling [6] and parallel coordinates [39]
  • 63. provide ways to display multi-dimensional fuzzy data in 2D without loss of information. However, the degree of membership is not indicated in a standard parallel coordinate plot. Berthold and Hall [5] use blurring to expose the level of fuzzy membership on parallel coordinates. An alternative proposal by Pham and Brown [90] extends parallel coordinates into the third dimension, where the new dimension represents the membership value. One technique for multi-dimensional scaling involves an algorithm that minimizes the inter-point distances. The rule set is then visualized as a 2D scatter plot, where gray scales denote different classes and the size of each square indicates the number of examples [93]. Another technique for viewing high-dimensional fuzzy rules in 2D places rules as shapes on a grid. The distance between rules in high-dimensional space is mapped to their distance in 2D. The technique uses a gradient descent algorithm to minimize the error between the 2D and actual distances [6].
    Figure 2.15: Tip Level Based on the Quality of the Food and Service Using Fuzzy Inference
    When visualizing clusters it is often a requirement to find outliers in the data. One method to improve the identification of outliers in fuzzy classification problems is to modify the “objective function”. Keller proposed additional weighting parameters for “representativeness” [55, p. 143]. The application of this technique produces the same principal clusters, but outliers are more easily detected since they are excluded to a greater degree from the fuzzy clusters.
    Fujiwara et al. [32] and Gershon [34] produced a 3D flowchart to represent rule structure to facilitate understanding of rule-based programs. This is an extension of the cone tree visualization technique [101]. Dickerson et al. [23] used a graph to
  • 64. encode relationships in a complex interacting system. This technique is useful for encoding expert information which is commonly present in fuzzy control systems. Brown and Pham [11] extended these techniques further by mapping additional features to uncertainty (such as opacity) for each node.
    Image based techniques can also be used to convey uncertainty. These methods are the uncertainty analogs of image based visualization techniques such as Line Integral Convolution (LIC) [40]. In these methods a pattern is generated that abstractly reflects the uncertainty. One difference between image based techniques and glyphs is that image based techniques apply a regular pattern over a continuous area. This avoids the clutter sometimes experienced by glyph techniques, where the glyphs obstruct one another. Sanderson et al. [103] used reaction-diffusion models in flow visualizations and conveyed uncertainty through spot size and orientation.
    Pang and Freeman [85] (see also [86]) observed that geometry can be added or modified to indicate uncertainty. An example of modification is to create a texturing-like effect by perturbing the orientation of faces within a geometric mesh model. The amount of perturbation is governed by the degree of uncertainty. There are two common examples of adding geometry. The first is to add geometry for a single data point, typically to give a direct indication of the extent over which the object can exist. Another is to connect successive data point extents, simulating the volume of possibility. An example of the latter was demonstrated by Lopes and Brodlie [69], who used tubes for particle flow visualization.
    All of the low-level visual features that have been discussed can be animated. For example, motion blur, flickering, and animated glyphs can be used to represent the precision of the measurements of a moving object [123, 10]. Brown [10] explored temporal vibrations for conveying uncertainty. 
The vibrations oscillate between values fast enough to be pre-attentive [114], causing a shimmering effect that implies uncertainty about its true position. Figure 2.16 shows frames from a movie of the luminance oscillation technique, with a region of high uncertainty framed within the dashed rectangle. This technique can also be applied in stereographic displays to facilitate a lack of binocular
  • 65. fusion [10].
    Figure 2.16: Two Frames from a Luminosity Oscillation Animation
    Probability distributions are often graphed as 2D line graphs. Kao et al. [52, 53, 54] explored showing multiple data points, each of which is subject to a probability distribution. In one example they overlaid the probability density functions for points of interest, as shown in Figure 2.17. Other approaches included using color, texture, and heightmaps to indicate uncertainty. Luo et al. [72] plotted many small histograms in a small multiples [115] technique.
    Figure 2.17: Visualization With Probability Density Functions Over Associated Data Points
    The Geographic Information Systems (GIS) field has had a particular interest in information uncertainty visualization. MacEachren et al. [74] and Slocum et al. [107] methodically review the state of play with respect to uncertainty visualization in GIS.
  • 66. Outside of this field, the development of visualization techniques for information uncertainty is typically ad hoc, being created for a specific modeling technique or application. All of these represent important steps forward; however, an integrated framework to manage the modeling and visualization of information uncertainty is currently missing.
    2.5 Summary
    In summary, the information uncertainty modeling and general visualization fields have separately been well studied. Several visualization techniques have been created for the various information uncertainty models. However, which one is best suited to the task at hand, and how will uncertainty modeling and propagation be tracked and interpreted properly? What happens when the information uncertainty changes? Information uncertainty modeling represents our knowledge and expectation about the behavior of a variable under uncertainty, and this knowledge may be subject to change over time, particularly as new information comes to light. Currently, there is no integrated framework for the modeling, propagation, and visual mapping of information uncertainty. Furthermore, there is no framework that can adapt to changes in information uncertainty.
  • 67. CHAPTER 3
    Framework for Integrated Uncertainty Modeling and Visualization
    3.1 A New Approach to Information Uncertainty
    Traditional visualization systems, which typically do not deal with information uncertainty, can still be subject to dynamic data. For these systems the dynamism refers to changes in value. The result is that the visualization needs to be recalculated, which is a straightforward process. Changes in information uncertainty, on the other hand, provide a unique challenge: the actual modeling technique used can change in response to changing information. Therefore, the data-type of the variable can be dynamic. This is clearly illustrated by the case of prediction: before the event comes to pass, the uncertainty can be modeled using a number of techniques; once the event has passed, the prediction can be updated with the actual outcome.¹ Thus the visualization must not only be recalculated, but must also adapt to this new data-type. Adapting to new data-types for a visualization is not a straightforward process.
    ¹ Assuming the outcome is known, the data-type then becomes one of absolute certainty.
  • 68. Visualization techniques are designed for a particular data-type and may not support the new data-type without modification. For example, line graphs rely on a series of values between which line segments are connected. Should the source of information be defined by a series of intervals, then the traditional line graph is no longer appropriate. One suitable modification turns the line segments into convex polygons, whose edges are defined by the upper and lower bounds of the interval.
    It is recognized that it is important that visualization systems convey uncertainty [83]. Many problems are subject to uncertainty and, as a consequence, visualization research has produced several visualization technique modifications to support the uncertainty. For example, transparent volumes have been added to volume renderings to indicate potential error (e.g. [51, p. 9]), and parallel plots were extended into the third dimension to handle fuzzy variables [93]. This type of work continues, and the outcomes continue to be data-type specific.
    The objective of this thesis is to integrate the process of modeling and visualizing information uncertainty into an extensible and adaptive visualization framework. Such a framework will provide greater uniformity for the field and enable both practitioners and researchers to reduce the data-type and visualization technique dependency.
    The process that the user follows when armed with such a tool can therefore change. The typical process that a user follows when dealing with uncertainty consists of the following steps.
    1. Decide on variables
    2. Decide on uncertainty data-type(s)
    3. Build the data model, propagating uncertainty manually
    4. Construct visualization(s), incorporating uncertainty where techniques are available and appropriate
    In practice, steps 3 and 4 will be repeated as information changes. However, step 2 will rarely be revisited. 
The significant point of this process is that the uncertainty model
  • 69. is decided upon before the user’s data model is built. This can be unintuitive, as the amount of uncertainty can change depending on how the model pans out. If it were easy to add, change, or remove uncertainty details at any point in the process, then the typical process changes, as follows.
    1. Decide on variables
    2. Build an initial data model
    3. Construct visualization(s)
    4. Add/remove/change uncertainty information
    Step 4 can occur anywhere after step 1 and can be repeated as often as necessary. Under such a process the uncertainty information is viewed as a refinement on details that does not fundamentally change the data model.
    This chapter describes an integrated framework for the modeling and visualization of information uncertainty. This framework is adaptive to changes in uncertainty information, allowing the user to select the appropriate techniques for the task at hand. In Section 3.2 we consider the issues that must be overcome, from which we derive requirements of this framework. Section 3.3 describes the components of the framework to meet the requirements. Section 3.4 provides a summary of key points.
    3.2 Analysis of Issues and Requirements
    This section examines the issues that confront users when they seek to visualize information uncertainty. From a theoretical perspective there are three main issues. Firstly, visualization techniques are based around specific uncertainty data-types. Thus, visualization techniques tend to be ad hoc. Secondly, there is incoherence between information uncertainty modeling techniques. This locks users into a particular modeling technique, the appropriateness of which may change as the information evolves. Thirdly, information uncertainty modeling and visualization is hampered by an artificial separation between the value of a variable and the uncertainty model of that value.
  • 70. This poses problems that affect the robustness of user models and the effort required to maintain them.
    From a practical point of view, the user is required to have both a comprehensive understanding of uncertainty as well as sophistication with visualization tools. Comprehensive understanding is required, because the user must manually encode and propagate uncertainty information; sophistication with visualization tools is required to allow for the unusual demands of mapping uncertainty to visual elements. Many tools lack support for information uncertainty modeling and visualization, leading determined users to cobble together multiple tools.
    3.2.1 Ad hoc Visualization Techniques
    Sensemaking Cycle and Changes in Uncertainty Information
    Visualization is “the bringing out of meaning in information” [56]. It is performed iteratively and usually as part of the sensemaking cycle [102, 17]. The iterative looping is not exclusive to mapping data into visual form; instead, users sometimes return to the data model to gather or transform data. This is particularly true for information uncertainty. For example: uncertainty details can be deemed to be more important later, once the basic model is in place; or the uncertainty details may change as more becomes known about the variables. Therefore, frameworks for information uncertainty visualization should ideally allow the user to go back to make changes with minimal effort.
    Flexibility
    Visualization of information uncertainty differs from visualizing other forms of information for two main reasons. Firstly, information uncertainty is always associated with a particular unit of information. This means that the uncertainty cannot be freely visualized without regard to its interpretation relative to the information to which it belongs. Secondly, information uncertainty is usually mapped differently to visual elements. 
For example, uncertainty is commonly mapped to intrinsic properties, such as
  • 71. transparency or color; or by adding a dimension to geometry, such as using a surface where there would otherwise be a line. Therefore, a visualization system for information uncertainty requires the flexibility to allow users to map uncertainty to compound visual elements, including intrinsic properties and adding dimensions to geometry.
    Figure 3.1 demonstrates how information uncertainty is associated with information, but typically mapped differently to visual elements. Four graph visualizations of historical and predicted employment rates in California are shown. The first graph (a) assumes that growth will continue at the average growth rate of the past 15 years and is therefore visualized using traditional means. While the information in graph (a) is modeled as not being subject to uncertainty, it requires an unreliable assumption about employment rates to be made. The graph in (b) estimates that the growth will continue at the average rate. The fact that the predictions are estimates is indicated by the line stippling, an intrinsic property of the line. The graph in (c) shows the possible range within the maximum and minimum growth rates experienced in the past 15 years. The uncertainty is indicated by extending the one dimensional line into a two dimensional polygon. The graph in (d) uses a normal distribution centered on the average growth rate. The uncertainty is indicated by both extending the dimensionality of the line as well as mapping to the intrinsic property of opacity.
    Heterogeneity in Uncertainty Information
    Several uncertainty visualization techniques have been developed for particular uncertainty types. However, in an environment where the uncertainty type can change to better suit the needs of the user, such restrictive preconditions for visualization techniques provide a return to the tyranny of uncertainty type lock-in. 
Therefore the approach to visualization of information uncertainty requires visualization to provide greater consistency across different uncertainty modeling techniques.
  • 72. Figure 3.1: Visualizations of Employment Numbers in California. Years 2005-2010 are Predicted. (a) Assuming Average Growth (b) Indicating Growth is Estimated (c) Possible Growth (d) Likely Growth. (Data Source: California Employment Development Department)
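    The extension of a line into a two dimensional polygon, as in graph (c) of Figure 3.1, can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the implementation used to produce the figure: the function name `interval_segments_to_quads` and the sample numbers are hypothetical. Each pair of neighbouring intervals yields one quadrilateral whose vertical edges are the intervals themselves; a crisp value is simply an interval whose lower and upper bounds coincide.

```python
def interval_segments_to_quads(xs, lowers, uppers):
    """Turn a series of intervals into one quadrilateral per line segment.

    xs      -- x positions, one per data point
    lowers  -- lower bound of the interval at each x
    uppers  -- upper bound of the interval at each x
    Returns a list of quads, each a list of four (x, y) vertices ordered
    lower-left, lower-right, upper-right, upper-left.
    """
    if not (len(xs) == len(lowers) == len(uppers)):
        raise ValueError("xs, lowers and uppers must have the same length")
    if any(lo > hi for lo, hi in zip(lowers, uppers)):
        raise ValueError("each lower bound must not exceed its upper bound")

    quads = []
    for i in range(len(xs) - 1):
        quads.append([
            (xs[i], lowers[i]),
            (xs[i + 1], lowers[i + 1]),
            (xs[i + 1], uppers[i + 1]),
            (xs[i], uppers[i]),
        ])
    return quads


if __name__ == "__main__":
    # Hypothetical data: the first point is known, the next two are predicted.
    years = [2004, 2005, 2006]
    low = [10.0, 10.1, 10.2]
    high = [10.0, 10.4, 10.8]
    for quad in interval_segments_to_quads(years, low, high):
        print(quad)
```

    Because the two vertical sides of each quadrilateral are parallel and the lower bound never exceeds the upper bound, every quadrilateral is convex, which keeps it simple to fill with a standard polygon renderer.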
  • 73. Homogeneous Access
    To enable the visual mappings that expose the uncertainty in variables, it is necessary to have access to the associated uncertainty details. However, there are numerous uncertainty modeling techniques that use different methods for encoding the uncertainty. This creates a barrier to visualizing uncertain information because visual mappings that work with one uncertainty modeling technique may not be easily transferable to another. Such inconsistency creates a strong dependency between visualizations and the data-types used in the model, limiting the user’s ability to update the data model. Therefore, a generalized means for accessing uncertainty information should be sought to enable a consistent environment for information uncertainty visualization.
    Plurality of Values
    Fundamental to the concept of information uncertainty is the ability for a variable to hold multiple values simultaneously; in other words, the variable has multiple possible collapses. This plurality of values represents the deferral of the approximation decision - the true value of a variable may be one of multiple candidates, each of which should be considered a possibility.
    3.2.2 Incoherence of Uncertainty Models
    Uncertainty Data Type Lock-in
    There is usually no support for changing from one uncertainty modeling technique to another. Adding uncertainty information to data allows the user to specify a greater level of detail about the data. However, changing the uncertainty data-type typically requires users to reconstruct the affected portion of the data model, often involving a fundamental change in form. This makes the data model rigid and, as a consequence, users will typically need to anticipate their use of uncertainty and build their model accordingly.
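    To make the ideas of homogeneous access and data-type lock-in concrete, the sketch below shows one way a visual mapping could be written against a shared interface instead of a specific uncertainty data-type. It is a hypothetical illustration only and should not be read as the framework developed in this thesis: the class names (`UncertainValue`, `Crisp`, `Interval`, `Normal`), the two methods, and the use of an interval midpoint as the expected value are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from statistics import NormalDist
from typing import List, Tuple


class UncertainValue(ABC):
    """Hypothetical common interface over several uncertainty data-types."""

    @abstractmethod
    def expected(self) -> float:
        """A single representative value (one possible collapse)."""

    @abstractmethod
    def bounds(self, coverage: float = 0.95) -> Tuple[float, float]:
        """Lower and upper limits of the candidate values."""


class Crisp(UncertainValue):
    def __init__(self, value: float):
        self.value = float(value)

    def expected(self) -> float:
        return self.value

    def bounds(self, coverage: float = 0.95) -> Tuple[float, float]:
        return (self.value, self.value)


class Interval(UncertainValue):
    def __init__(self, low: float, high: float):
        if low > high:
            raise ValueError("low must not exceed high")
        self.low, self.high = float(low), float(high)

    def expected(self) -> float:
        # Assumption for this sketch: use the midpoint as the representative.
        return (self.low + self.high) / 2.0

    def bounds(self, coverage: float = 0.95) -> Tuple[float, float]:
        return (self.low, self.high)


class Normal(UncertainValue):
    def __init__(self, mean: float, stdev: float):
        if stdev <= 0:
            raise ValueError("stdev must be positive")
        self.dist = NormalDist(mean, stdev)

    def expected(self) -> float:
        return self.dist.mean

    def bounds(self, coverage: float = 0.95) -> Tuple[float, float]:
        # Central interval; coverage must lie strictly between 0 and 1.
        tail = (1.0 - coverage) / 2.0
        return (self.dist.inv_cdf(tail), self.dist.inv_cdf(1.0 - tail))


def band_heights(values: List[UncertainValue], coverage: float = 0.95) -> List[float]:
    """A visual mapping that depends only on the shared interface."""
    heights = []
    for v in values:
        low, high = v.bounds(coverage)
        heights.append(high - low)
    return heights


if __name__ == "__main__":
    series = [Crisp(5.0), Interval(4.0, 6.0), Normal(5.0, 0.5)]
    print(band_heights(series))
```

    In this arrangement, replacing an `Interval` with a `Normal` leaves `band_heights` unchanged, which is the kind of independence between visualization and uncertainty data-type that the requirements above call for.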