Applications
Artificial Intelligence
And
Simulations
Applications
Pallet of
Data Structures
Algorithms
Choose which to use and
Combine them To form your model
Of reality
Reality to model → the modeler → the computer
You are the artist and the computer is your
canvas
Knowledge Representation
Abstraction
You choose how to represent reality
The choice is not unique
It depends on what aspect of reality you want to represent and how
Applications:
Acquisition, management and use
of knowledge
Theme of lecture:
Abstraction of reality through knowledge engineering
Applications:
Acquisition, management and use of
knowledge
• Storage and management of Information
• Making Sense of Knowledge
• Acquisition of knowledge
– Feature Acquisition
– Concept Abstraction
• Problem Solving
• Use of knowledge in and as models
– Problem Solving
– Simulations
Storing and Managing Information
Table of data
Database Management Systems (DBMS)
Storage and retrieval of properties of objects
Spreadsheets
Manipulations of and calculations with the data in the table
Each row is a particular object
Each column is a property associated with that object
Two examples/paradigms of management systems
Database Management System (DBMS)
Organizes data
in sets of
tables
Relational Database Management System
(RDBMS)
Table A
Name | Address | Parcel #
John Smith | 18 Lawyers Dr. | 756554
T. Brown | 14 Summers Tr. | 887419

Table B
Parcel # | Assessed Value
887419 | 152,000
446397 | 100,000
Provides relationships
Between
data in the tables
Using SQL- Structured Query Language
• SQL is a standard database protocol, adopted by most
‘relational’ databases
• Provides syntax for data:
– Definition
– Retrieval
– Functions (COUNT, SUM, MIN, MAX, etc.)
– Updates and Deletes
• SELECT list FROM table WHERE condition
• list - a list of columns, or * for all columns
• WHERE - a logical expression limiting the records selected;
conditions can be combined with Boolean logic: AND, OR, NOT
• ORDER BY may be used to sort the results
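A minimal sketch of these clauses using Python's built-in sqlite3 module; the table name and values are hypothetical, loosely echoing the parcel tables above:

```python
import sqlite3

# In-memory database with a hypothetical parcels table (names are illustrative)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (name TEXT, parcel INTEGER, value INTEGER)")
conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)",
                 [("T. Brown", 887419, 152000), ("J. Doe", 446397, 100000)])

# SELECT list FROM table WHERE condition, with ORDER BY to sort the results
rows = conn.execute(
    "SELECT name, value FROM parcels WHERE value >= 100000 ORDER BY value DESC"
).fetchall()
print(rows)  # [('T. Brown', 152000), ('J. Doe', 100000)]
```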
Spreadsheets
Every row is
a different “object”
with a set of properties
Every column is
a different property
of the row object
Spreadsheet
Organization of elements
Column A Column B Column C
Row 1
Row 2
Row 3
Row and column
indices
Cells with addresses
A7 B4 C10 D5
Accessing each cell
Spreadsheet Formulas
Formula: Combination of
values or cell references
and mathematical
operators such as +, -, /, *
The formula displays in
the entry bar. This
formula is used to add the
values in the four cells.
The sum is displayed in
cell B7.
The results of a formula
display in the cell.
With cell, row and column functions
e.g. AVERAGE, SUM, MIN, MAX
Visualizing data:
Charts
Applications:
Acquisition, management and use of
knowledge
• Storage and management of Information
• Making Sense of Knowledge
• Acquisition of knowledge
– Feature Acquisition
– Concept Abstraction
• Use of knowledge in and as models
– Problem Solving
– Simulations
Making Sense of Knowledge
Time flies like an arrow proverb
Fruit flies like a banana Groucho Marx
There is a semantic and context behind all words
Flies:
1. The act of flying
2. The insect
Like:
1. Similar to
2. Are fond of
There is also the elusive “Common Sense”
1. One type of fly, the fruit fly, is fond of bananas
2. Fruit, in general, flies through the air just like a banana
3. One type of fly, the fruit fly, is just like a banana
A bit complicated because we are speaking metaphorically,
Time is not really an object, like a bird, which flies
Translation is not just doing a one-to-one search in the dictionary
Complex Searches is not just searching for individual words
Google translate
Adding Semantics:
Ontologies
Concept
conceptual entity of the domain
Attribute
property of a concept
Relation
relationship between concepts
or properties
Axiom
coherent description between
Concepts / Properties /
Relations via logical expressions
Person
Student Professor
Lecture
isA – hierarchy (taxonomy)
name email
student
nr.
research
field
topic
lecture
nr.
attends
holds
Structuring of:
• Background Knowledge
• “Common Sense” knowledge
Structure of an Ontology
Ontologies typically have two distinct components:
Names for important concepts in the domain
– Elephant is a concept whose members are a kind of animal
– Herbivore is a concept whose members are exactly those animals who eat
only plants or parts of plants
– Adult_Elephant is a concept whose members are exactly those elephants
whose age is greater than 20 years
Background knowledge/constraints on the domain
– Adult_Elephants weigh at least 2,000 kg
– All Elephants are either African_Elephants or Indian_Elephants
– No individual can be both a Herbivore and a Carnivore
Ontology Definition
Formal, explicit specification of a shared conceptualization
commonly accepted
understanding
conceptual model
of a domain
(ontological theory)
unambiguous
terminology definitions
machine-readability
with computational
semantics
[Gruber93]
The Semantic Web
Ontology implementation
"The Semantic Web is an extension of the current web in which information is given
well-defined meaning, better enabling computers and people to work in
cooperation." -- Tim Berners-Lee
“the wedding cake”
Applications:
Acquisition, management and use of
knowledge
• Storage and management of Information
• Making Sense of Knowledge
• Acquisition of knowledge
– Feature Acquisition
– Concept Abstraction
• Use of knowledge in and as models
– Problem Solving
– Simulations
Abstracting
Knowledge
Several levels and reasons to abstract knowledge
Feature abstraction
Simplifying “reality” so the knowledge can be used in
Computer data structures and algorithms
Concept Abstraction
Organizing and making sense of the immense amount of
data/knowledge we have
Modeling abstraction
Making usable and predictive models of reality
Feature Abstraction
Simplifying “reality” so the knowledge can be used in
Computer data structures and algorithms
A photograph of a face
is a set of pixels
Is it a face?
Whose face?
Feature Abstraction
Simplifying “reality” so the knowledge can be used in
Computer data structures and algorithms
A photograph
of a face
Is it a face?
Whose face?
The eye sees the pixels
In the visual cortex,
Features are detected
Feature Abstraction
Simplifying “reality” so the knowledge can be used in
Computer data structures and algorithms
n! = 1            if n = 1
n! = n · (n − 1)!  if n > 1
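This recursive definition translates directly into code; a minimal Python sketch:

```python
def factorial(n: int) -> int:
    """Recursive definition: n! = 1 if n = 1, else n * (n-1)!"""
    return 1 if n <= 1 else n * factorial(n - 1)

print(factorial(5))  # 120
```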
Photograph made up of pixels
The pixels need to be converted to
Data structures the algorithms can understand
Feature Abstraction:
Boundary Detection
• Is this a boundary?
Feature Detection
“flat” region:
no change in all
directions
“edge”:
no change along
the edge direction
“corner”:
significant change
in all directions
Harris Detector: Intuition
From a square sampling of pixels
Principal Component Analysis (PCA)
• Finding a map of the principal components (PCs) of data into an
orthogonal space
• Method: find the set of eigenvalues and eigenvectors in a vector space:
– The eigenvectors are the principal components
– The eigenvalues are the ranking of the vectors
• PCs – variables with the largest variances
– Orthogonality (each coordinate is orthogonal)
– Linearity – optimal least mean-square error
• Limitations?
– Strict linearity
– Assumes a specific distribution
– Large-variance assumption
Rotates coordinate system
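A minimal numpy sketch of this procedure on synthetic 2-D data; the data and the variance ratio are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data stretched along one direction
x = rng.normal(size=200)
data = np.column_stack([x, 2.0 * x + 0.1 * rng.normal(size=200)])
data -= data.mean(axis=0)

# Eigen-decomposition of the covariance matrix:
# eigenvectors are the principal components, eigenvalues rank them
cov = np.cov(data.T)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Rotate into the new coordinate system; most variance lands on axis 0
rotated = data @ eigvecs
print(eigvals[0] > 10 * eigvals[1])  # True: one dominant component
```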
Feature Detection
 ( , ) ,
u
E u v u v M
v
 
  
 
Intensity change in shifting window: eigenvalue analysis
1, 2 – eigenvalues of M
direction of the
slowest change
direction of the
fastest change
(max)-1/2
(min)-1/2
Ellipse E(u,v) = const
Harris Detector: Mathematics of the analysis of pixels
Transformation of coordinates
Principal component analysis
can reduce the set of coordinates:
one coordinate carries the signal,
the other coordinate is noise
(all points are “shifted” to the principal component)
Harris Detector: Mathematics
1
2
“Corner”
1 and 2 are large,
1 ~ 2;
E increases in all
directions
1 and 2 are small;
E is almost constant
in all directions
“Edge”
1 >> 2
“Edge”
2 >> 1
“Flat”
region
Classification
of the
new coordinates
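The eigenvalue classification can be sketched as a simple rule; the threshold here is illustrative, not a standard value:

```python
def classify_region(l1: float, l2: float, small: float = 1e-2) -> str:
    """Classify a pixel neighborhood from the eigenvalues l1, l2 of
    the structure matrix M (the `small` threshold is illustrative)."""
    big1, big2 = l1 > small, l2 > small
    if big1 and big2:
        return "corner"   # E increases in all directions
    if big1 or big2:
        return "edge"     # one eigenvalue >> the other
    return "flat"         # E is almost constant in all directions

print(classify_region(0.8, 0.7))      # corner
print(classify_region(0.8, 0.001))    # edge
print(classify_region(0.001, 0.002))  # flat
```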
PCA: Feature from pixels
1
2
“Corner”
1 and 2 are large,
1 ~ 2;
E increases in all
directions
“Edge”
1 >> 2
“Edge”
2 >> 1
“Flat”
region
One principal component
along the line;
the other component is
small
Note that the line can be in any direction
The principal component follows the line
Rotation invariant
1
2
“Corner”
1 and 2 are large,
1 ~ 2;
E increases in all
directions
“Edge”
1 >> 2
“Edge”
2 >> 1
“Flat”
region
PCA: Feature from pixels
There is no line
No principal component
PCA: Feature from pixels
1
2
“Corner”
1 and 2 are large,
1 ~ 2;
E increases in all
directions
“Edge”
1 >> 2
“Edge”
2 >> 1
“Flat”
region
There are two lines in
(almost) orthogonal
(perpendicular)
directions
Two principal components
Feature Detection
Ellipse rotates but its shape (i.e. eigenvalues) remains
the same
Corner response R is invariant to image rotation
Important property: Rotationally invariant
SIFT Descriptor
• A 16x16 gradient window is taken and partitioned into 4x4 subwindows.
• Histogram of 4x4 samples in 8 directions
• Gaussian weighting around the center (σ is 0.5 times the scale of the
keypoint)
• 4x4x8 = 128-dimensional feature vector
Another localized feature from the pixels
Feature Detection
• Use the scale/orientation
determined by the detector
to define a normalized frame.
• Compute a descriptor
in this frame.
Scale example:
• moments integrated over an adapted window
• derivatives adapted to scale: σ·Ix
Scale & orientation example:
• resample all points/regions to 11×11 pixels
• PCA coefficients
• principal components of all points
SIFT Descriptors also invariant to Scale/Orientation
Feature Abstraction
Simplifying “reality” so the knowledge can be used in
Computer data structures and algorithms
n! = 1            if n = 1
n! = n · (n − 1)!  if n > 1
New “features”
represented
in data structures that can be used
in algorithms
Hierarchy of analysis
Hierarchy of features
Simple primitive features
Complex combinations
of simple features
Face detection
Example: Face Detection
• Scan window over image
• Classify window as either:
– Face
– Non-face
Window → Classifier → Face or Non-face
From the established features
Face Detection Algorithm
Face Localization
Lighting Compensation
Skin Color Detection
Color Space Transformation
Variance-based Segmentation
Connected Component &
Grouping
Face Boundary Detection
Verifying/ Weighting
Eyes-Mouth Triangles
Eye/ Mouth Detection
Facial Feature Detection
Input Image
Output Image
Applications:
Acquisition, management and use of
knowledge
• Storage and management of Information
• Making Sense of Knowledge
• Acquisition of knowledge
– Feature Acquisition
– Concept Abstraction
• Use of knowledge in and as models
– Problem Solving
– Simulations
Concept Abstraction
Organizing and making sense of the immense amount of
data/knowledge we have
Generalization
The ability of an algorithm to perform accurately on new, unseen
examples after having trained on a learning data set
Generalization
Consider the following regression problem:
Predict real value on the y-axis from the real value on the x-axis.
You are given 6 examples: {Xi,Yi}.
X*
What is the y-value for a new query ?
Generalization
X*
What is the y-value for a new query ?
Generalization
X*
What is the y-value for a new query ?
Generalization
which curve is best?
X*
What is the y-value for a new query ?
Generalization
Occam’s razor:
prefer the
simplest hypothesis
consistent with data.
Have to find
a balance
of constraints
Two Schools of Thought
1. Statistical “Learning”
The data is reduced to vectors of numbers
Statistical techniques are used for the tasks to be performed.
2. Structural “Learning”
The data is converted to a discrete structure
(such as a grammar or a graph) and the
techniques are related to computer science
subjects (such as parsing and graph matching).
A spectrum of machine learning tasks
• High-dimensional data (e.g. more
than 100 dimensions)
• The noise is not sufficient to
obscure the structure in the data
if we process it right.
• There is a huge amount of
structure in the data, but the
structure is too complicated to be
represented by a simple model.
• The main problem is figuring out
a way to represent the
complicated structure that allows
it to be learned.
• Low-dimensional data (e.g. less
than 100 dimensions)
• Lots of noise in the data
• There is not much structure in the
data, and what structure there is,
can be represented by a fairly
simple model.
• The main problem is
distinguishing true structure from
noise.
Statistics--------------------- Artificial Intelligence
Supervised
learning
Unsupervised
learning
Concept Acquisition
Statistics
learning with the presence of an expert
Data is labelled with a class or value
Goal: predict the class or value label
Supervised Learning
Learn the properties of a classification
Decision making
Predict (classify) sample → discrete set of class labels
e.g. C = {object 1, object 2 … } for recognition task
e.g. C = {object, !object} for detection task
Spam vs. No-Spam
learning without the presence of an expert
Data is unlabelled with a class or value
Goal:
determine data patterns/groupings
and the properties of that classification
Unsupervised Learning
Association or clustering:
grouping a set of instances by attribute similarity
e.g. image segmentation
Key concept: Similarity
Statistical Methods
Regression:
Predict sample → associated real (continuous) value
e.g. data fitting
Learning within the constraints of the method
Data is basically an n-dimensional set of numerical attributes
Deterministic/Mathematical algorithms based on
probability distributions
Principal Component Analysis:
Transform to a new (simpler) set of coordinates
e.g. find the major component of the data
Pattern Recognition
Another name for machine learning
• A pattern is an object, process or event that can be given a
name.
• A pattern class (or category) is a set of patterns sharing
common attributes and usually originating from the same
source.
• During recognition (or classification) given objects are
assigned to prescribed classes.
• A classifier is a machine which performs classification.
“The assignment of a physical object or event to one of several pre-specified
categories” -- Duda & Hart
Cross-Validation
In the mathematics of statistics
A mathematical definition of the error
Function of the probability distribution
Average
Standard deviation
In machine learning,
no such distribution exists
Full
Data set
Training set
Test set
Build the ML
Data structure
Determine Error
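The train/test split described above can be sketched as follows; the 70/30 fraction and the toy data are illustrative choices:

```python
import random

def train_test_split(data, test_fraction=0.3, seed=0):
    """Shuffle and split. The held-out test set estimates the error
    empirically, since no closed-form error distribution exists."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(list(range(10)))
print(len(train), len(test))  # 7 3
```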
Classification algorithms
– Fisher linear discriminant
– KNN
– Decision tree
– Neural networks
– SVM
– Naïve Bayes
– Adaboost
– Many many more ….
– Each one has its own properties with respect to bias, speed,
accuracy, transparency…
Feature extraction
Task: to extract features which are good for classification.
Good features: • Objects from the same class have similar feature values.
• Objects from different classes have different values.
“Good” features “Bad” features
Similarity
Two objects
belong to the
same classification
If
They are “close”
Distance between them is small
Need a function
F(object1, object2) = “distance” between them
Similarity measure
Distance metric
• How do we measure what it means to be “close”?
• Depending on the problem we should choose an appropriate
distance metric.
For example: Least squares distance
f(a, b) = Σᵢ₌₁ⁿ (aᵢ − bᵢ)²
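A direct transcription of this least-squares distance (note this version sums squared differences without taking a square root):

```python
def distance(a, b):
    """Least-squares distance between two feature vectors,
    f(a, b) = sum over i of (a_i - b_i)^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

print(distance([0, 0], [3, 4]))  # 25
```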
Types of Model
Discriminative Generative
Generative vs. Discriminative
Overfitting and underfitting
Problem: how rich class of classifications q(x;θ) to use.
underfitting / good fit / overfitting
Problem of generalization: a small empirical risk R_emp does not imply a small true expected
risk R.
Generative
Cluster Analysis
Create “clusters”
Depending on distance metric
Hierarchical,
based on “how close”
objects are
KNN – K nearest neighbors
– Find the k nearest neighbors of the test example, and infer
its class using their known classes.
– E.g. K=3
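A minimal sketch of k-nearest-neighbor classification; the example points and labels are hypothetical:

```python
from collections import Counter

def knn_classify(query, examples, k=3):
    """Classify `query` by majority vote of its k nearest labelled
    neighbours, using squared Euclidean distance."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(examples, key=lambda e: dist(e[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labelled examples: two clusters, classes "A" and "B"
examples = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
            ((5, 5), "B"), ((5, 6), "B")]
print(knn_classify((1, 1), examples))  # A
```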
Discriminative:
Support Vector Machine
• Q: How to draw the optimal linear
separating hyperplane?
• A: Maximize the margin
• Margin maximization
– The distance between H+1 and H−1 is 2/||w||
– Thus, ||w|| should be minimized
Margin
Prediction Based on Bayes’ Theorem
• Given training data X, posteriori probability of a hypothesis H,
P(H|X), follows the Bayes’ theorem
• Informally, this can be viewed as
posteriori = likelihood x prior/evidence
• Predicts X belongs to Ci iff the probability P(Ci|X) is the highest
among all the P(Ck|X) for all the k classes
• Practical difficulty: It requires initial knowledge of many
probabilities, involving significant computational cost
P(H|X) = P(X|H) · P(H) / P(X)
Naïve Bayes Classifier
age | income | student | credit_rating | buys_computer
<=30 high no fair no
<=30 high no excellent no
31…40 high no fair yes
>40 medium no fair yes
>40 low yes fair yes
>40 low yes excellent no
31…40 low yes excellent yes
<=30 medium no fair no
<=30 low yes fair yes
>40 medium yes fair yes
<=30 medium yes excellent yes
31…40 medium no excellent yes
31…40 high yes fair yes
>40 medium no excellent no
Class:
C1:buys_computer = ‘yes’
C2:buys_computer = ‘no’
P(buys_computer = “yes”)
= 9/14 = 0.643
P(buys_computer = “no”)
= 5/14= 0.357
X = (age <= 30 , income = medium, student =
yes, credit_rating = fair)
Naïve Bayes Classifier
age | income | student | credit_rating | buys_computer
<=30 high no fair no
<=30 high no excellent no
31…40 high no fair yes
>40 medium no fair yes
>40 low yes fair yes
>40 low yes excellent no
31…40 low yes excellent yes
<=30 medium no fair no
<=30 low yes fair yes
>40 medium yes fair yes
<=30 medium yes excellent yes
31…40 medium no excellent yes
31…40 high yes fair yes
>40 medium no excellent no
Class:
C1:buys_computer = ‘yes’
C2:buys_computer = ‘no’
Want to classify
X =
(age <= 30 ,
income = medium,
student = yes,
credit_rating = fair)
Will X buy a computer?
Naïve Bayes Classifier
Key: Conditional probability
P(X|Y) The probability that X is true, given Y
P(not rain| sunny) > P(rain | sunny)
P(not rain| not sunny) < P(rain | not sunny)
Classifier: Have to include the probability of the condition
P(not rain | sunny)*P(sunny)
How often did it really not rain, given that it was actually sunny
Naïve Bayes Classifier
Class:
C1:buys_computer = ‘yes’
C2:buys_computer = ‘no’
Want to classify
X =
(age <= 30 ,
income = medium,
student = yes,
credit_rating = fair)
Will X buy a computer?
Which “conditional probability” is greater?
P(X|C1)*P(C1) > P(X|C2) *P(C2) X will buy a computer
P(X|C1) *P(C1) < P(X|C2) *P(C2) X will not buy a computer
Naïve Bayes Classifier
age | income | student | credit_rating | buys_computer
<=30 high no fair no
<=30 high no excellent no
31…40 high no fair yes
>40 medium no fair yes
>40 low yes fair yes
>40 low yes excellent no
31…40 low yes excellent yes
<=30 medium no fair no
<=30 low yes fair yes
>40 medium yes fair yes
<=30 medium yes excellent yes
31…40 medium no excellent yes
31…40 high yes fair yes
>40 medium no excellent no
Class:
C1:buys_computer = ‘yes’
C2:buys_computer = ‘no’
X =
(age <= 30 ,
income = medium,
student = yes,
credit_rating = fair)
P(age = “<=30” | buys_computer = “yes”) = 2/9 = 0.222
P(age = “<= 30” | buys_computer = “no”) = 3/5 = 0.6
Naïve Bayes Classifier
• Compute P(X|Ci) for each class
P(age = “<=30” | buys_computer = “yes”) = 2/9 = 0.222
P(age = “<= 30” | buys_computer = “no”) = 3/5 = 0.6
P(income = “medium” | buys_computer = “yes”) = 4/9 = 0.444
P(income = “medium” | buys_computer = “no”) = 2/5 = 0.4
P(student = “yes” | buys_computer = “yes) = 6/9 = 0.667
P(student = “yes” | buys_computer = “no”) = 1/5 = 0.2
P(credit_rating = “fair” | buys_computer = “yes”) = 6/9 = 0.667
P(credit_rating = “fair” | buys_computer = “no”) = 2/5 = 0.4
Naïve Bayes Classifier
P(X|Ci) :
P(X|buys_computer = “yes”) = 0.222 x 0.444 x 0.667 x 0.667 = 0.044
P(X|buys_computer = “no”) = 0.6 x 0.4 x 0.2 x 0.4 = 0.019
P(X|Ci)*P(Ci) :
P(X|buys_computer = “yes”) * P(buys_computer = “yes”) = 0.028
P(X|buys_computer = “no”) * P(buys_computer = “no”) = 0.007
Therefore, X belongs to class (“buys_computer = yes”), since 0.028 > 0.007
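The slide's arithmetic can be checked directly; the probabilities below are read off the training table above:

```python
# Priors and per-attribute conditional probabilities from the slide
p_yes, p_no = 9 / 14, 5 / 14
likelihood_yes = (2/9) * (4/9) * (6/9) * (6/9)  # age, income, student, credit
likelihood_no  = (3/5) * (2/5) * (1/5) * (2/5)

# Naive Bayes score: P(X|Ci) * P(Ci) for each class
score_yes = likelihood_yes * p_yes
score_no = likelihood_no * p_no
print(round(score_yes, 3), round(score_no, 3))  # 0.028 0.007
print("buys" if score_yes > score_no else "does not buy")  # buys
```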
Decision Tree Classifier
Ross Quinlan
(Scatter plot: Antenna Length, 1–10, versus Abdomen Length, 1–10)
Abdomen Length > 7.1?
  yes → Katydid
  no → Antenna Length > 6.0?
    yes → Katydid
    no → Grasshopper
Antennae shorter than body?
  yes → Grasshopper
  no → 3 Tarsi?
    no → Cricket
    yes → Foretibia has ears?
      yes → Katydids
      no → Camel Cricket
Decision trees predate computers
• Decision tree
– A flow-chart-like tree structure
– Internal node denotes a test on an attribute
– Branch represents an outcome of the test
– Leaf nodes represent class labels or class distribution
• Decision tree generation consists of two phases
– Tree construction
• At start, all the training examples are at the root
• Partition examples recursively based on selected attributes
– Tree pruning
• Identify and remove branches that reflect noise or outliers
• Use of decision tree: Classifying an unknown sample
– Test the attribute values of the sample against the decision tree
Decision Tree Classification
• Basic algorithm (a greedy algorithm)
– Tree is constructed in a top-down recursive divide-and-conquer manner
– At start, all the training examples are at the root
– Attributes are categorical (if continuous-valued, they can be discretized
in advance)
– Examples are partitioned recursively based on selected attributes.
– Test attributes are selected on the basis of a heuristic or statistical
measure (e.g., information gain)
• Conditions for stopping partitioning
– All samples for a given node belong to the same class
– There are no remaining attributes for further partitioning – majority
voting is employed for classifying the leaf
– There are no samples left
How do we construct the decision tree?
Information Gain as A Splitting Criteria
• Select the attribute with the highest information gain (information gain is the
expected reduction in entropy).
• Assume there are two classes, P and N
– Let the set of examples S contain p elements of class P and n elements of
class N
– The amount of information, needed to decide if an arbitrary example in S
belongs to P or N is defined as
E(S) = −(p/(p+n)) · log₂(p/(p+n)) − (n/(p+n)) · log₂(n/(p+n))
0 · log(0) is defined as 0
Information Gain in Decision Tree Induction
• Assume that using attribute A, a current set will be
partitioned into some number of child sets
• The encoding information that would be gained by
branching on A
Gain(A) = E(current set) − E(all child sets)
Note: entropy is at its minimum if the collection of objects is completely uniform
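These two formulas can be sketched directly; the 4F/5M and hair-length numbers come from the example that follows:

```python
import math

def entropy(p: int, n: int) -> float:
    """E(S) for p elements of P and n of N; 0*log(0) treated as 0."""
    total = p + n
    def term(x):
        return 0.0 if x == 0 else -(x / total) * math.log2(x / total)
    return term(p) + term(n)

def gain(parent, children):
    """Gain(A) = E(parent) - weighted sum of E(child) over the split."""
    total = sum(p + n for p, n in children)
    return entropy(*parent) - sum(
        (p + n) / total * entropy(p, n) for p, n in children)

print(round(entropy(4, 5), 4))                  # 0.9911, the 4F/5M root
print(round(gain((4, 5), [(1, 3), (3, 2)]), 4)) # 0.0911, hair-length split
```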
Person | Hair Length | Weight | Age | Class
Homer 0” 250 36 M
Marge 10” 150 34 F
Bart 2” 90 10 M
Lisa 6” 78 8 F
Maggie 4” 20 1 F
Abe 1” 170 70 M
Selma 8” 160 41 F
Otto 10” 180 38 M
Krusty 6” 200 45 M
Comic 8” 290 38 ?
Hair Length <= 5?
yes no
Entropy(4F,5M) = -(4/9)log2(4/9) - (5/9)log2(5/9)
= 0.9911
Entropy(S) = −(p/(p+n)) · log₂(p/(p+n)) − (n/(p+n)) · log₂(n/(p+n))
Gain(Hair Length <= 5) = 0.9911 – (4/9 * 0.8113 + 5/9 * 0.9710 ) = 0.0911
Gain(A) = E(current set) − E(all child sets)
Let us try splitting on
Hair length
Weight <= 160?
yes no
Entropy(4F,5M) = -(4/9)log2(4/9) - (5/9)log2(5/9)
= 0.9911
Entropy(S) = −(p/(p+n)) · log₂(p/(p+n)) − (n/(p+n)) · log₂(n/(p+n))
Gain(Weight <= 160) = 0.9911 – (5/9 * 0.7219 + 4/9 * 0 ) = 0.5900
Gain(A) = E(current set) − E(all child sets)
Let us try splitting on
Weight
age <= 40?
yes no
Entropy(4F,5M) = -(4/9)log2(4/9) - (5/9)log2(5/9)
= 0.9911
Entropy(S) = −(p/(p+n)) · log₂(p/(p+n)) − (n/(p+n)) · log₂(n/(p+n))
Gain(Age <= 40) = 0.9911 – (6/9 * 1 + 3/9 * 0.9183 ) = 0.0183
Gain(A) = E(current set) − E(all child sets)
Let us try splitting on
Age
Weight <= 160?
yes no
Hair Length <= 2?
yes no
Of the 3 features we had, Weight was best.
But while people who weigh over 160 are
perfectly classified (as males), the under 160
people are not perfectly classified… So we
simply recurse!
This time we find that we can split on
Hair length, and we are done!
Weight <= 160?
yes no
Hair Length <= 2?
yes no
We don’t need to keep the data around,
just the test conditions.
Male
Male Female
How would these
people be
classified?
Applications:
Acquisition, management and use of
knowledge
• Storage and management of Information
• Making Sense of Knowledge
• Acquisition of knowledge
– Feature Acquisition
– Concept Abstraction
• Use of knowledge in and as models
– Problem Solving
– Simulations
Using Knowledge
Problem Solving
Simulations
Searching for a solution
Combining models
to form a large comprehensive model
Problem Solving
Basis of the search
Order in which nodes are evaluated and expanded
Determined by Two Lists
OPEN: List of unexpanded nodes
CLOSED: List of expanded nodes
Searching for a solution through all possible solutions
Fundamental algorithm in artificial intelligence
Graph Search
Abstraction:
State of a system
chess
Tic-tac-toe
Water jug problem
Traveling salesman problem
In problem solving:
Search for the
steps
leading to the solution
The individual steps
are the
states of the system
Solution Space
The set of all states of the problem
Including the goal state(s)
All possible board combinations
All possible reference points
All possible combinations
Search Space
Each system state
(nodes)
is connected by rules
(connections)
on how to get
from one state to another
Search Space
How the states are connected
Legal moves
Paths between points Possible operations
Strategies to Search
Space of System States
• Breadth first search
• Depth first search
• Best first search
Determines order
in which the states are searched
to find solution
Breadth-first searching
• A breadth-first search (BFS)
explores nodes nearest the
root before exploring nodes
further away
• For example, after searching
A, then B, then C, the search
proceeds with D, E, F, G
• Nodes are explored in the
order A B C D E F G H I J K L
M N O P Q
• J will be found before N
Depth-first searching
• A depth-first search (DFS)
explores a path all the way to
a leaf before backtracking and
exploring another path
• For example, after searching
A, then B, then D, the search
backtracks and tries another
path from B
• Nodes are explored in the
order A B D E H L M N I
O P C F G J K Q
• N will be found before J
Breadth First Search
Items between the red bars are siblings.
Expansion continues until the goal is reached or OPEN is empty.
Expand A to new nodes B, C, D
Expand B to new nodes E, F
Send to back of queue
Queue: FIFO (first in, first out)
Depth first Search
Expand A to new nodes B, C, D
Expand B to new nodes E, F
Send to front of stack
Stack: LIFO (last in, first out)
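Both strategies can be sketched with one function, differing only in which end of the OPEN list is expanded; the example graph is hypothetical:

```python
from collections import deque

# Hypothetical graph: A has children B, C, D; B has children E, F
graph = {"A": ["B", "C", "D"], "B": ["E", "F"],
         "C": [], "D": [], "E": [], "F": []}

def search(start, breadth_first=True):
    """OPEN holds unexpanded nodes, CLOSED the expanded ones.
    Taking from the front (FIFO queue) gives breadth-first order;
    taking from the back (LIFO stack) gives depth-first order."""
    open_list, closed = deque([start]), []
    while open_list:
        node = open_list.popleft() if breadth_first else open_list.pop()
        closed.append(node)
        for child in graph[node]:
            open_list.append(child)
    return closed

print(search("A", breadth_first=True))   # ['A', 'B', 'C', 'D', 'E', 'F']
print(search("A", breadth_first=False))  # ['A', 'D', 'C', 'B', 'F', 'E']
```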
Best First Search
Breadth first search: queue (FIFO)
Depth first search: stack (LIFO)
Uninformed searches:
No knowledge of how good the current solution is
(are we on the right track?)
Best First Search: Priority Queue
Associated with each node is a heuristic
F(node) = the quality of the node to lead to a final solution
A* search
• Idea: avoid expanding paths that are already expensive
•
• Evaluation function f(n) = g(n) + h(n)
•
• g(n) = cost so far to reach n
• h(n) = estimated cost from n to goal
• f(n) = estimated total cost of path through n to goal
This is the hard/unknown part
If h(n) is an underestimate of the true cost, then the algorithm is guaranteed to find an optimal solution
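A minimal A* sketch over a hypothetical graph; with h = 0 it reduces to uniform-cost search, which is trivially admissible:

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A*: expand nodes in order of f(n) = g(n) + h(n).
    `h` must be admissible (never overestimate) for optimality."""
    open_heap = [(h(start), 0, start, [start])]  # (f, g, node, path)
    closed = set()
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, g
        if node in closed:
            continue
        closed.add(node)
        for nxt, cost in neighbors(node):
            if nxt not in closed:
                heapq.heappush(
                    open_heap,
                    (g + cost + h(nxt), g + cost, nxt, path + [nxt]))
    return None, float("inf")

# Hypothetical weighted graph; h = 0 is an admissible heuristic
edges = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
         "C": [("D", 1)], "D": []}
path, cost = a_star("A", "D", lambda n: edges[n], lambda n: 0)
print(path, cost)  # ['A', 'B', 'C', 'D'] 3
```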
Admissible heuristics
• A heuristic h(n) is admissible if for every node n,
h(n) ≤ h*(n), where h*(n) is the true cost to reach
the goal state from n.
• An admissible heuristic never overestimates the cost
to reach the goal, i.e., it is optimistic
• Example: hSLD(n) (never overestimates the actual
road distance)
• Theorem: If h(n) is admissible, A* using TREE-
SEARCH is optimal
Graph Search
Several Structures Used
Graph Search
The graph as search space
Breadth first search Queue
Depth first search Stack
Best first search Priority Queue
Stacks and queues, depending on search strategy
Applications:
Acquisition, management and use of
knowledge
• Storage and management of Information
• Making Sense of Knowledge
• Acquisition of knowledge
– Feature Acquisition
– Concept Abstraction
• Use of knowledge in and as models
– Problem Solving
– Simulations
Problem Solving
Simulations
Example: Climate Simulation
Climate Model
Climate Modeling
A multitude of sub-models
(diagram: many interconnected submodels)
Many stemming from the techniques discussed previously
Physical processes regulating climate
Physical models representing all the interactions that can occur
Radiation
Even one physical quantity can have many
source models, sink models and interaction models
“Earth System Model”
And ocean model, sea-ice model, land surface model, etc…
3D atmosphere
3D ocean
2D sea ice
Atmospheric CO2
2D land surface
Land biogeochemistry
Ocean biogeochemistry
Ocean sediments
3D ice sheets
Mathematical Models
representing
physical principles
Meteorological Primitive Equations
• Applicable to a wide scale of motions: > 1 hour,
> 100 km
Global Climate Model Physics
Terms F, Q, and Sq represent physical processes
• Equations of motion, F
– turbulent transport, generation, and dissipation of
momentum
• Thermodynamic energy equation, Q
– convective-scale transport of heat
– convective-scale sources/sinks of heat (phase change)
– radiative sources/sinks of heat
• Water vapor mass continuity equation
– convective-scale transport of water substance
– convective-scale water sources/sinks (phase change)
Model Physical Parameterizations
Physical processes breakdown:
• Moist Processes
– Moist convection, shallow convection, large scale
condensation
• Radiation and Clouds
– Cloud parameterization, radiation
• Surface Fluxes
– Fluxes from land, ocean and sea ice (from data or models)
• Turbulent mixing
– Planetary boundary layer parameterization, vertical
diffusion, gravity wave drag
Process Models and Parameterization
•Boundary Layer
•Clouds
Stratiform
Convective
•Microphysics
Evolution of
Global Climate
Models (GCMs)
… increasing complexity.
Due to demand
(want/need to model
more complex
systems)
Increased computing
power enables more
complex models
http://www.usgcrp.gov/usgcrp/images/ocp2003/ocpfy2003-fig3-4.htm
The past, present and future of climate models
During the last 25
years, different
components are added
to the climate model to
better represent our
climate system
Grid Discretizations
Equations are distributed on a sphere
• Different grid approaches:
– Rectilinear (lat-lon)
– Reduced grids
– ‘equal area grids’: icosahedral, cubed sphere
– Spectral transforms
• Different numerical methods for solution:
– Spectral Transforms
– Finite element
– Lagrangian (semi-lagrangian)
• Vertical Discretization
– Terrain following (sigma)
– Pressure
– Isentropic
– Hybrid Sigma-pressure (most common)
The heart of
Computational Fluid Dynamics
(CFD)
Different time and spacial scales
Microscopic properties
intermingling with
macroscopic properties
Fast processes
(ex. Molecular reactions)
Interacting with
Very slow process
(ex. Transport/movement of molecules
To other regions)
This often makes mathematically solving the problems
very difficult
~10⁷ m: the planetary scale
~10⁵ m: cloud cluster scale
~10³ m: cloud scale
~10⁻⁶ m – 1 m: cloud microphysical scale
Scales of Atmospheric Motions/Processes
Anthes et al.
Resolved Scales
Global Models
Future Global Models
Cloud/Mesoscale/Turbulence Models
Cloud Drops
Microphysics
CHEMISTRY
(scale axis: 10 m, 100 m, 1 km, 10 km, 100 km, 1000 km, 10000 km)
turbulence → cumulus clouds → cumulonimbus clouds →
mesoscale convective systems → extratropical cyclones → planetary waves
waves
Large Eddy Simulation (LES)
Model Cloud System Resolving Model (CSRM)
Numerical Weather Prediction (NWP) Model
Global Climate Model
No single model can encompass all relevant processes
DNS: mm scale, cloud microphysics
Knowledge Representation
Abstraction
You choose how to represent reality
The choice is not unique
It depends on what aspect of reality you want to represent and how
Applications:
Acquisition, management and use of
knowledge
• Storage and management of Information
• Making Sense of Knowledge
• Acquisition of knowledge
– Feature Acquisition
– Concept Abstraction
• Problem Solving
• Use of knowledge in and as models
– Problem Solving
– Simulations
Storage and management of
Information
Table A
Name        Address          Parcel #
John Smith  18 Lawyers Dr.   756554
T. Brown    14 Summers Tr.   887419

Table B
Parcel #    Assessed Value
887419      152,000
446397      100,000
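How an RDBMS relates the two tables through the shared parcel number can be sketched with Python's built-in sqlite3 module; the table and column names here are chosen for the sketch:

```python
import sqlite3

# In-memory database holding the two tables from the slide.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE owners (name TEXT, address TEXT, parcel INTEGER)")
con.execute("CREATE TABLE parcels (parcel INTEGER, assessed_value INTEGER)")
con.executemany("INSERT INTO owners VALUES (?, ?, ?)",
                [("John Smith", "18 Lawyers Dr.", 756554),
                 ("T. Brown", "14 Summers Tr.", 887419)])
con.executemany("INSERT INTO parcels VALUES (?, ?)",
                [(887419, 152000), (446397, 100000)])

# The parcel number is the shared key: joining on it relates an owner
# in Table A to the assessed value of their parcel in Table B.
rows = con.execute("""
    SELECT o.name, p.assessed_value
    FROM owners o JOIN parcels p ON o.parcel = p.parcel
""").fetchall()
```

Only T. Brown's parcel (887419) appears in both tables, so the join returns a single row relating that owner to the value 152,000.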
Making Sense of Knowledge
Concept: conceptual entity of the domain
Attribute: property of a concept
Relation: relationship between concepts or properties
Axiom: coherent description between concepts / properties / relations via logical expressions
Example isA-hierarchy (taxonomy): Student and Professor are Persons, with attributes such as name, email, student nr. and research field; a Student attends a Lecture (lecture nr., topic), which a Professor holds
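A minimal sketch of this taxonomy in Python, with the isA hierarchy expressed as class inheritance. The class and attribute names follow the slide; everything else (the example people and lecture) is illustrative:

```python
# Concepts as classes; the isA hierarchy is class inheritance.
class Person:
    def __init__(self, name, email):
        self.name, self.email = name, email       # attributes of Person

class Student(Person):
    def __init__(self, name, email, student_nr):
        super().__init__(name, email)
        self.student_nr = student_nr
        self.attends = []                         # relation: Student attends Lecture

class Professor(Person):
    def __init__(self, name, email, research_field):
        super().__init__(name, email)
        self.research_field = research_field
        self.holds = []                           # relation: Professor holds Lecture

class Lecture:
    def __init__(self, lecture_nr, topic):
        self.lecture_nr, self.topic = lecture_nr, topic

# Illustrative instances; an axiom (e.g. "every attended lecture is held
# by some professor") could be enforced as a consistency check.
ai = Lecture(101, "Artificial Intelligence")
prof = Professor("Ada", "ada@uni.example", "knowledge engineering")
prof.holds.append(ai)
stud = Student("Bob", "bob@uni.example", 42)
stud.attends.append(ai)
```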
Acquisition of knowledge:
Feature Acquisition
“flat” region: no change in any direction
“edge”: no change along the edge direction
“corner”: significant change in all directions
(as determined from a square sampling of pixels)
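The flat/edge/corner distinction can be sketched with a Harris-style test on a square patch: the two eigenvalues of the gradient structure tensor measure how much the intensity changes in each direction. The threshold of 0.5 below is an arbitrary illustrative choice:

```python
import numpy as np

def classify_patch(patch, strong=0.5):
    """Classify a square pixel patch as 'flat', 'edge' or 'corner'
    from the eigenvalues of its gradient structure tensor."""
    gy, gx = np.gradient(patch.astype(float))
    # Structure tensor: sums of gradient products over the patch.
    m = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    l1, l2 = np.linalg.eigvalsh(m)   # ascending: l1 <= l2
    if l2 < strong:                  # no change in any direction
        return "flat"
    if l1 < strong:                  # change in one direction only
        return "edge"
    return "corner"                  # significant change in all directions

flat = np.ones((8, 8))                         # constant intensity
edge = np.zeros((8, 8)); edge[:, 4:] = 1.0     # vertical step
corner = np.zeros((8, 8)); corner[4:, 4:] = 1.0  # step in both directions
```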
Acquisition of knowledge:
Concept Abstraction
Bayes decision rule: assign X to class C1 if P(X|C1)·P(C1) > P(X|C2)·P(C2)
(e.g., conclude “X will buy a computer”)
Abdomen Length > 7.1?
  yes → Katydid
  no  → Antenna Length > 6.0?
          yes → Katydid
          no  → Grasshopper
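The insect decision tree above is simply a pair of nested conditionals, with the thresholds taken from the slide:

```python
def classify_insect(abdomen_length, antenna_length):
    """Decision tree from the slide: Katydid vs. Grasshopper."""
    if abdomen_length > 7.1:       # first split
        return "Katydid"
    if antenna_length > 6.0:       # second split on the 'no' branch
        return "Katydid"
    return "Grasshopper"
```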
Use of knowledge in and as models
Problem Solving
Example search tree (the slide draws root A at the bottom):
A
B C
D E F
G H I J K
L M N O P Q
Breadth-first search → Queue
Depth-first search → Stack
Best-first search → Priority Queue
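All three searches can share a single loop, differing only in the frontier data structure used to hold unexpanded nodes. The example tree and the heuristic in the test are hypothetical, not from the slide's drawing:

```python
from collections import deque
import heapq

def search(start, goal, neighbors, frontier="queue", h=lambda n: 0):
    """Generic search; only the frontier data structure changes:
    queue -> breadth-first, stack -> depth-first,
    priority queue (ordered by h) -> best-first."""
    fringe = [(h(start), start)] if frontier == "priority" else deque([start])
    visited, order = set(), []
    while fringe:
        if frontier == "queue":
            node = fringe.popleft()          # FIFO: breadth-first
        elif frontier == "stack":
            node = fringe.pop()              # LIFO: depth-first
        else:
            _, node = heapq.heappop(fringe)  # lowest h first: best-first
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        if node == goal:
            return order
        for nb in neighbors.get(node, []):
            if nb not in visited:
                if frontier == "priority":
                    heapq.heappush(fringe, (h(nb), nb))
                else:
                    fringe.append(nb)
    return order

# Hypothetical tree with root A, for trying the three strategies.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
        "D": ["H"], "E": ["I", "J"], "F": ["K"]}
```

Running the same tree with `"queue"` expands level by level, `"stack"` dives down one branch first, and `"priority"` follows whichever node the heuristic currently scores best.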
Use of knowledge in and as models
Simulations
Applications
You choose how to represent reality
  • 10. Spreadsheets Every row is a different “object” with a set of properties Every column is a different property of the row object
  • 11. Spreadsheet Organization of elements Column A Column B Column C Row 1 Row 2 Row 3 Row and column indices Cells with addresses A7 B4 C10 D5 Accessing each cell
  • 12. Spreadsheet Formulas Formula: Combination of values or cell references and mathematical operators such as +, -, /, * The formula displays in the entry bar. This formula is used to add the values in the four cells. The sum is displayed in cell B7. The results of a formula display in the cell. With cell, row and column functions, e.g. AVERAGE, SUM, MIN, MAX
  • 14. Applications: Acquisition, management and use of knowledge • Storage and management of Information • Making Sense of Knowledge • Acquisition of knowledge – Feature Acquisition – Concept Abstraction • Use of knowledge in and as models – Problem Solving – Simulations
  • 15. Making Sense of Knowledge Time flies like an arrow proverb Fruit flies like a banana Groucho Marx There are semantics and context behind all words Flies: 1. The act of flying 2. The insect Like: 1. Similar to 2. Are fond of There is also the elusive “Common Sense” 1. One type of fly, the fruit fly, is fond of bananas 2. Fruit, in general, flies through the air just like a banana 3. One type of fly, the fruit fly, is just like a banana A bit complicated because we are speaking metaphorically; Time is not really an object, like a bird, which flies Translation is not just doing a one-to-one search in the dictionary Complex search is not just searching for individual words (Google Translate)
  • 16. Adding Semantics: Ontologies Concept conceptual entity of the domain Attribute property of a concept Relation relationship between concepts or properties Axiom coherent description between Concepts / Properties / Relations via logical expressions 16 Person Student Professor Lecture isA – hierarchy (taxonomy) name email student nr. research field topic lecture nr. attends holds Structuring of: • Background Knowledge • “Common Sense” knowledge
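The concept/attribute/isA structure on this slide can be sketched with plain Python dictionaries. A toy illustration; the helper functions `ancestors` and `all_attributes` are my own assumptions, not part of any ontology standard:

```python
# The slide's isA hierarchy and attributes as dictionaries
# (concept names follow the slide; the code is an illustrative sketch).
is_a = {"Student": "Person", "Professor": "Person"}
attributes = {"Person": ["name", "email"],
              "Student": ["student nr."],
              "Professor": ["research field"]}

def ancestors(concept):
    """Walk the isA chain from a concept up to the root."""
    chain = []
    while concept in is_a:
        concept = is_a[concept]
        chain.append(concept)
    return chain

def all_attributes(concept):
    """A concept inherits the attributes of every ancestor in the taxonomy."""
    attrs = list(attributes.get(concept, []))
    for parent in ancestors(concept):
        attrs += attributes.get(parent, [])
    return attrs
```

For example, a Student has its own "student nr." plus the inherited "name" and "email" of a Person.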
  • 17. Structure of an Ontology Ontologies typically have two distinct components: Names for important concepts in the domain – Elephant is a concept whose members are a kind of animal – Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants – Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years Background knowledge/constraints on the domain – Adult_Elephants weigh at least 2,000 kg – All Elephants are either African_Elephants or Indian_Elephants – No individual can be both a Herbivore and a Carnivore 17
  • 18. Ontology Definition 18 Formal, explicit specification of a shared conceptualization commonly accepted understanding conceptual model of a domain (ontological theory) unambiguous terminology definitions machine-readability with computational semantics [Gruber93]
  • 19. The Semantic Web Ontology implementation 19 "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee “the wedding cake”
  • 20. Applications: Acquisition, management and use of knowledge • Storage and management of Information • Making Sense of Knowledge • Acquisition of knowledge – Feature Acquisition – Concept Abstraction • Use of knowledge in and as models – Problem Solving – Simulations
  • 21. Abstracting Knowledge Several levels and reasons to abstract knowledge Feature abstraction Simplifying “reality” so the knowledge can be used in Computer data structures and algorithms Concept Abstraction Organizing and making sense of the immense amount of data/knowledge we have Modeling abstraction Making usable and predictive models of reality
  • 22. Feature Abstraction Simplifying “reality” so the knowledge can be used in Computer data structures and algorithms A photograph of a face Set Of pixels Is it a face? Whose face?
  • 23. Feature Abstraction Simplifying “reality” so the knowledge can be used in Computer data structures and algorithms A photograph of a face Is it a face? Whose face? The eye sees the pixels In the visual cortex, Features are detected
  • 24. Feature Abstraction Simplifying “reality” so the knowledge can be used in Computer data structures and algorithms n! = 1 if n = 1; n·(n-1)! if n > 1 Photograph made up of pixels The pixels need to be converted to Data structures the algorithms can understand
  • 26. Feature Detection “flat” region: no change in all directions “edge”: no change along the edge direction “corner”: significant change in all directions Harris Detector: Intuition From a square sampling of pixels
  • 27. Principal Component Analysis (PCA) • Finding a map of principal components (PCs) of data into an orthogonal space • Method: Find the set of eigenvalues in a vector space: – The eigenvectors are the principal components – The eigenvalues are the ranking of the vectors • PCs – Variables with the largest variances – Orthogonality (each coordinate is orthogonal) – Linearity – Optimal least mean-square error • Limitations? – Strict linearity – specific distribution – Large variance assumption x1 x2 Rotates coordinate system
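For 2-D data the eigendecomposition has a closed form, so PCA can be sketched without any linear-algebra library. A minimal illustration in pure Python, assuming a population (divide-by-n) covariance; the function name and data are my own:

```python
import math

def pca_2d(points):
    """Eigenvalues of the 2x2 covariance matrix of 2-D points,
    largest first. For a symmetric matrix [[a, b], [b, c]] the
    eigenvalues are (a+c)/2 +/- sqrt(((a-c)/2)^2 + b^2)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    a = sum((x - mx) ** 2 for x, _ in points) / n      # var(x)
    c = sum((y - my) ** 2 for _, y in points) / n      # var(y)
    b = sum((x - mx) * (y - my) for x, y in points) / n  # cov(x, y)
    mid = (a + c) / 2
    d = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return mid + d, mid - d  # ranked variances along the two PCs

# Points on the line y = 2x: all variance lies along one principal
# component, so the second eigenvalue vanishes (the "noise" direction).
lam1, lam2 = pca_2d([(0, 0), (1, 2), (2, 4), (3, 6)])
```

This is exactly the "one coordinate carries the data, the other is noise" picture from the following slides.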
  • 28. Feature Detection Intensity change in shifting window: E(u,v) ≈ [u v] M [u v]^T, eigenvalue analysis λ1, λ2 – eigenvalues of M direction of the slowest change direction of the fastest change (λmax)^(-1/2) (λmin)^(-1/2) Ellipse E(u,v) = const Harris Detector: Mathematics of the analysis of pixels Transformation of coordinates Principal component analysis
  • 29. Can reduce the set of coordinates One coordinate The other coordinate is noise (all points are “shifted” to the Principle component)
  • 30. Harris Detector: Mathematics λ1, λ2 “Corner”: λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions “Flat” region: λ1 and λ2 are small; E is almost constant in all directions “Edge”: λ1 >> λ2 “Edge”: λ2 >> λ1 Classification of the new coordinates
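The λ1/λ2 classification above can be written as a tiny function. The thresholds below are illustrative assumptions; the actual Harris detector uses a response R = det M − k·(trace M)² rather than explicit thresholds on the eigenvalues:

```python
def classify_region(l1, l2, large=1.0, ratio=10.0):
    """Label a pixel window from the two eigenvalues of M.
    `large` and `ratio` are illustrative thresholds, not Harris's."""
    if l1 < large and l2 < large:
        return "flat"      # both small: E almost constant in all directions
    if l1 > ratio * l2 or l2 > ratio * l1:
        return "edge"      # one dominates: no change along the edge direction
    return "corner"        # both large and comparable: E increases everywhere
```

For example, two comparable large eigenvalues give "corner", one dominant eigenvalue gives "edge", two small ones give "flat".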
  • 31. PCA: Feature from pixels λ1, λ2 “Corner”: λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions “Edge”: λ1 >> λ2 “Edge”: λ2 >> λ1 “Flat” region One principal component Along the line The other component is small Note that line can be in any direction Principal component follows line Rotation invariant
  • 32. 1 2 “Corner” 1 and 2 are large, 1 ~ 2; E increases in all directions “Edge” 1 >> 2 “Edge” 2 >> 1 “Flat” region PCA: Feature from pixels There is no line No principle component
  • 33. PCA: Feature from pixels λ1, λ2 “Corner”: λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions “Edge”: λ1 >> λ2 “Edge”: λ2 >> λ1 “Flat” region There are two lines in (almost) orthogonal (perpendicular) Directions Two principal components
  • 34. Feature Detection Ellipse rotates but its shape (i.e. eigenvalues) remains the same Corner response R is invariant to image rotation Important property: Rotationally invariant
  • 35. SIFT Descriptor • 16x16 Gradient window is taken. Partitioned into 4x4 subwindows. • Histogram of 4x4 samples in 8 directions • Gaussian weighting around center (σ is 0.5 times the scale of the keypoint) • 4x4x8 = 128 dimensional feature vector Another localized feature from the pixels
  • 36. Feature Detection • Use the scale/orientation determined by the detector to define a normalized frame. • Compute a descriptor in this frame. Scale example: • moments integrated over an adapted window • derivatives adapted to scale: σ·Ix Scale & orientation example: Resample all points/regions to 11x11 pixels • PCA coefficients • Principal components of all points. SIFT descriptors are also invariant to Scale/Orientation
  • 37. Feature Abstraction Simplifying “reality” so the knowledge can be used in Computer data structures and algorithms n! = 1 if n = 1; n·(n-1)! if n > 1 New “features” represented in data structures that can be used in algorithms
  • 38. Hierarchy of analysis Hierarchy of features Simple primitive features Complex combinations of simple features Face detection
  • 39. Example: Face Detection • Scan window over image • Classify window as either: – Face – Non-face ClassifierWindow Face Non-face From the established features
  • 40. Face Detection Algorithm Face Localization Lighting Compensation Skin Color Detection Color Space Transformation Variance-based Segmentation Connected Component & Grouping Face Boundary Detection Verifying/ Weighting Eyes-Mouth Triangles Eye/ Mouth Detection Facial Feature Detection Input Image Output Image
  • 41. Applications: Acquisition, management and use of knowledge • Storage and management of Information • Making Sense of Knowledge • Acquisition of knowledge – Feature Acquisition – Concept Abstraction • Use of knowledge in and as models – Problem Solving – Simulations
  • 42. Concept Abstraction Organizing and making sense of the immense amount of data/knowledge we have Generalization The ability of an algorithm to perform accurately on new, unseen examples after having trained on a learning data set
  • 43. Generalization Consider the following regression problem: Predict real value on the y-axis from the real value on the x-axis. You are given 6 examples: {Xi,Yi}. X* What is the y-value for a new query ?
  • 44. Generalization X* What is the y-value for a new query ?
  • 45. Generalization X* What is the y-value for a new query ?
  • 46. Generalization which curve is best? X* What is the y-value for a new query ?
  • 47. Generalization Occam’s razor: prefer the simplest hypothesis consistent with data. Have to find a balance of constraints
  • 48. Two Schools of Thought 48 1. Statistical “Learning” The data is reduced to vectors of numbers Statistical techniques are used for the tasks to be performed. 2. Structural “Learning” The data is converted to a discrete structure (such as a grammar or a graph) and the techniques are related to computer science subjects (such as parsing and graph matching).
  • 49. A spectrum of machine learning tasks • High-dimensional data (e.g. more than 100 dimensions) • The noise is not sufficient to obscure the structure in the data if we process it right. • There is a huge amount of structure in the data, but the structure is too complicated to be represented by a simple model. • The main problem is figuring out a way to represent the complicated structure that allows it to be learned. • Low-dimensional data (e.g. less than 100 dimensions) • Lots of noise in the data • There is not much structure in the data, and what structure there is, can be represented by a fairly simple model. • The main problem is distinguishing true structure from noise. Statistics <---------------------> Artificial Intelligence
  • 51. Supervised Learning learning with the presence of an expert Data is labelled with a class or value Goal:: predict class or value label c1 c2 c3 Learn the properties of a classification Decision making Predict (classify) sample → discrete set of class labels e.g. C = {object 1, object 2 … } for recognition task e.g. C = {object, !object} for detection task Spam / No-Spam
  • 52. learning without the presence of an expert Data is unlabelled with a class or value Goal:: determine data patterns/groupings and the properties of that classification Unsupervised Learning Association or clustering:: grouping a set of instances by attribute similarity e.g. image segmentation Key concept: Similarity
  • 53. Statistical Methods Regression:: Predict sample → associated real (continuous) value e.g. data fitting x 1 x 2 Learning within the constraints of the method Data is basically an n-dimensional set of numerical attributes Deterministic/Mathematical algorithms based on probability distributions Principal Component Analysis:: Transform to a new (simpler) set of coordinates e.g. find the major component of the data
  • 54. Pattern Recognition Another name for machine learning • A pattern is an object, process or event that can be given a name. • A pattern class (or category) is a set of patterns sharing common attributes and usually originating from the same source. • During recognition (or classification) given objects are assigned to prescribed classes. • A classifier is a machine which performs classification. “The assignment of a physical object or event to one of several prespecified categories” -- Duda & Hart
  • 55. Cross-Validation In the mathematics of statistics there is a mathematical definition of the error: a function of the probability distribution (average, standard deviation). In machine learning, no such distribution exists. Full Data set = Training set (build the ML data structure) + Test set (determine error)
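Repeating the train/test split k times, so that every sample serves as test data exactly once, is k-fold cross-validation. A plain-Python sketch of the index bookkeeping, not any specific library's API:

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield k (train_indices, test_indices) pairs over samples 0..n-1.
    Each sample lands in exactly one test fold. Names are illustrative."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)       # fixed seed for reproducibility
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_splits(10, 5))
```

The model is trained k times, once per split, and the k test errors are averaged to estimate the true error without a known probability distribution.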
  • 56. Classification algorithms – Fisher linear discriminant – KNN – Decision tree – Neural networks – SVM – Naïve Bayes – AdaBoost – Many many more …. – Each one has its properties with respect to bias, speed, accuracy, transparency…
  • 57. Feature extraction Task: to extract features which are good for classification. Good features: • Objects from the same class have similar feature values. • Objects from different classes have different values. “Good” features “Bad” features
  • 58. Similarity Two objects belong to the same classification If They are “close” x1 x2 ? ? ? ? ? Distance between them is small Need a function F(object1, object2) = “distance” between them
  • 59. Similarity measure Distance metric • How do we measure what it means to be “close”? • Depending on the problem we should choose an appropriate distance metric. For example: Least squares distance f(a,b) = Σ_{i=1..n} (a_i - b_i)^2
  • 60. Types of Model Discriminative Generative Generative vs. Discriminative
  • 61. Overfitting and underfitting Problem: how rich a class of classifications q(x;θ) to use. underfitting good fit overfitting Problem of generalization: a small empirical risk R_emp does not imply small true expected risk R.
  • 62. Generative Cluster Analysis Create “clusters” Depending on distance metric Hierarchical Based on “how close” Objects are
  • 63. KNN – K nearest neighbors x1 x2 ? ? ? ? – Find the k nearest neighbors of the test example, and infer its class using their known class. – E.g. K=3 ?
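A minimal KNN sketch using the Euclidean distance from the earlier similarity slide; the training data, labels, and k are illustrative assumptions:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Majority vote among the k nearest neighbours.
    `train` is a list of ((x1, x2), label) pairs."""
    nearest = sorted(train, key=lambda t: math.dist(t[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Two well-separated illustrative clusters
train = [((0, 0), "red"), ((0, 1), "red"), ((1, 0), "red"),
         ((5, 5), "blue"), ((5, 6), "blue"), ((6, 5), "blue")]
```

A query near the origin is voted "red" by its three nearest neighbours; one near (5, 5) is voted "blue".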
  • 64. Discriminative: Support Vector Machine • Q: How to draw the optimal linear separating hyperplane?  A: Maximizing margin • Margin maximization – The distance between H+1 and H-1 is 2/||w|| – Thus, ||w|| should be minimized Margin
  • 65. Prediction Based on Bayes’ Theorem • Given training data X, the posterior probability of a hypothesis H, P(H|X), follows Bayes’ theorem: P(H|X) = P(X|H) P(H) / P(X) • Informally, this can be viewed as posterior = likelihood x prior / evidence • Predicts X belongs to Ci iff the probability P(Ci|X) is the highest among all the P(Ck|X) for all the k classes • Practical difficulty: It requires initial knowledge of many probabilities, involving significant computational cost
  • 66. Naïve Bayes Classifier age | income | student | credit_rating | buys_computer <=30 high no fair no <=30 high no excellent no 31…40 high no fair yes >40 medium no fair yes >40 low yes fair yes >40 low yes excellent no 31…40 low yes excellent yes <=30 medium no fair no <=30 low yes fair yes >40 medium yes fair yes <=30 medium yes excellent yes 31…40 medium no excellent yes 31…40 high yes fair yes >40 medium no excellent no Class: C1: buys_computer = ‘yes’ C2: buys_computer = ‘no’ P(buys_computer = “yes”) = 9/14 = 0.643 P(buys_computer = “no”) = 5/14 = 0.357 X = (age <= 30, income = medium, student = yes, credit_rating = fair)
  • 67. Naïve Bayes Classifier age | income | student | credit_rating | buys_computer <=30 high no fair no <=30 high no excellent no 31…40 high no fair yes >40 medium no fair yes >40 low yes fair yes >40 low yes excellent no 31…40 low yes excellent yes <=30 medium no fair no <=30 low yes fair yes >40 medium yes fair yes <=30 medium yes excellent yes 31…40 medium no excellent yes 31…40 high yes fair yes >40 medium no excellent no Class: C1: buys_computer = ‘yes’ C2: buys_computer = ‘no’ Want to classify X = (age <= 30, income = medium, student = yes, credit_rating = fair) Will X buy a computer?
  • 68. Naïve Bayes Classifier 68 Key: Conditional probability P(X|Y) The probability that X is true, given Y P(not rain| sunny) > P(rain | sunny) P(not rain| not sunny) < P(rain | not sunny) Classifier: Have to include the probability of the condition P(not rain | sunny)*P(sunny) How often did it really not rain, given that it was actually sunny
  • 69. Naïve Bayes Classifier 69 Class: C1:buys_computer = ‘yes’ C2:buys_computer = ‘no’ Want to classify X = (age <= 30 , income = medium, student = yes, credit_rating = fair) Will X buy a computer? Which “conditional probability” is greater? P(X|C1)*P(C1) > P(X|C2) *P(C2) X will buy a computer P(X|C1) *P(C1) < P(X|C2) *P(C2) X will not buy a computer
  • 70. Naïve Bayes Classifier age | income | student | credit_rating | buys_computer <=30 high no fair no <=30 high no excellent no 31…40 high no fair yes >40 medium no fair yes >40 low yes fair yes >40 low yes excellent no 31…40 low yes excellent yes <=30 medium no fair no <=30 low yes fair yes >40 medium yes fair yes <=30 medium yes excellent yes 31…40 medium no excellent yes 31…40 high yes fair yes >40 medium no excellent no Class: C1: buys_computer = ‘yes’ C2: buys_computer = ‘no’ X = (age <= 30, income = medium, student = yes, credit_rating = fair) P(age = “<=30” | buys_computer = “yes”) = 2/9 = 0.222 P(age = “<= 30” | buys_computer = “no”) = 3/5 = 0.6
  • 71. Naïve Bayes Classifier • Compute P(X|Ci) for each class P(age = “<=30” | buys_computer = “yes”) = 2/9 = 0.222 P(age = “<= 30” | buys_computer = “no”) = 3/5 = 0.6 P(income = “medium” | buys_computer = “yes”) = 4/9 = 0.444 P(income = “medium” | buys_computer = “no”) = 2/5 = 0.4 P(student = “yes” | buys_computer = “yes) = 6/9 = 0.667 P(student = “yes” | buys_computer = “no”) = 1/5 = 0.2 P(credit_rating = “fair” | buys_computer = “yes”) = 6/9 = 0.667 P(credit_rating = “fair” | buys_computer = “no”) = 2/5 = 0.4 71
  • 72. Naïve Bayes Classifier P(X|Ci) : P(X|buys_computer = “yes”) = 0.222 x 0.444 x 0.667 x 0.667 = 0.044 P(X|buys_computer = “no”) = 0.6 x 0.4 x 0.2 x 0.4 = 0.019 P(X|Ci)*P(Ci) : P(X|buys_computer = “yes”) * P(buys_computer = “yes”) = 0.028 P(X|buys_computer = “no”) * P(buys_computer = “no”) = 0.007 Therefore, X belongs to class (“buys_computer = yes”) Bigger
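The whole calculation from the last few slides fits in a few lines of code. A sketch that recomputes the slide's numbers directly from the 14-row training table (attribute encoding is my own):

```python
# The 14 training rows from the slide: (age, income, student, credit, buys)
rows = [
    ("<=30", "high", "no", "fair", "no"), ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"), (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"), (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"), ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"), (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"), ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"), (">40", "medium", "no", "excellent", "no"),
]

def score(x, cls):
    """P(X|C) * P(C), with the naive per-attribute independence assumption."""
    in_cls = [r for r in rows if r[-1] == cls]
    p = len(in_cls) / len(rows)                          # prior P(C)
    for i, value in enumerate(x):
        p *= sum(r[i] == value for r in in_cls) / len(in_cls)
    return p

x = ("<=30", "medium", "yes", "fair")
p_yes, p_no = score(x, "yes"), score(x, "no")
# p_yes ~ 0.028 > p_no ~ 0.007, so X is classified as buys_computer = yes
```

The two scores reproduce the slide's 0.028 vs 0.007, so X is predicted to buy a computer.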
  • 73. Decision Tree Classifier Ross Quinlan [scatter plot of Antenna Length vs Abdomen Length] Abdomen Length > 7.1? yes: Katydid; no: Antenna Length > 6.0? yes: Katydid; no: Grasshopper
  • 74. Grasshopper Antennae shorter than body? Cricket Foretibia has ears? Katydids Camel Cricket Yes Yes Yes No No 3 Tarsi? No Decision trees predate computers
  • 75. • Decision tree – A flow-chart-like tree structure – Internal node denotes a test on an attribute – Branch represents an outcome of the test – Leaf nodes represent class labels or class distribution • Decision tree generation consists of two phases – Tree construction • At start, all the training examples are at the root • Partition examples recursively based on selected attributes – Tree pruning • Identify and remove branches that reflect noise or outliers • Use of decision tree: Classifying an unknown sample – Test the attribute values of the sample against the decision tree Decision Tree Classification
  • 76. • Basic algorithm (a greedy algorithm) – Tree is constructed in a top-down recursive divide-and-conquer manner – At start, all the training examples are at the root – Attributes are categorical (if continuous-valued, they can be discretized in advance) – Examples are partitioned recursively based on selected attributes. – Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain) • Conditions for stopping partitioning – All samples for a given node belong to the same class – There are no remaining attributes for further partitioning – majority voting is employed for classifying the leaf – There are no samples left How do we construct the decision tree?
  • 77. Information Gain as A Splitting Criteria • Select the attribute with the highest information gain (information gain is the expected reduction in entropy). • Assume there are two classes, P and N – Let the set of examples S contain p elements of class P and n elements of class N – The amount of information, needed to decide if an arbitrary example in S belongs to P or N is defined as E(S) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n)) 0 log(0) is defined as 0
  • 78. Information Gain in Decision Tree Induction • Assume that using attribute A, a current set will be partitioned into some number of child sets • The encoding information that would be gained by branching on A: Gain(A) = E(Current set) - E(all child sets) Note: entropy is at its minimum if the collection of objects is completely uniform
  • 79. Person Hair Length Weight Age Class Homer 0” 250 36 M Marge 10” 150 34 F Bart 2” 90 10 M Lisa 6” 78 8 F Maggie 4” 20 1 F Abe 1” 170 70 M Selma 8” 160 41 F Otto 10” 180 38 M Krusty 6” 200 45 M Comic 8” 290 38 ?
  • 80. Let us try splitting on Hair length Hair Length <= 5? yes no Entropy(4F,5M) = -(4/9)log2(4/9) - (5/9)log2(5/9) = 0.9911 Gain(Hair Length <= 5) = 0.9911 - (4/9 * 0.8113 + 5/9 * 0.9710) = 0.0911
  • 81. Let us try splitting on Weight Weight <= 160? yes no Entropy(4F,5M) = -(4/9)log2(4/9) - (5/9)log2(5/9) = 0.9911 Gain(Weight <= 160) = 0.9911 - (5/9 * 0.7219 + 4/9 * 0) = 0.5900
  • 82. Let us try splitting on Age age <= 40? yes no Entropy(4F,5M) = -(4/9)log2(4/9) - (5/9)log2(5/9) = 0.9911 Gain(Age <= 40) = 0.9911 - (6/9 * 1 + 3/9 * 0.9183) = 0.0183
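The entropy and gain numbers on these three slides can be recomputed with a short script. A sketch that works from (female, male) class counts per branch; the weight split's child counts (4F/1M under 160, 0F/4M over 160) are read off the person table:

```python
import math

def entropy(p, n):
    """E(S) for p elements of one class and n of the other; 0*log 0 = 0."""
    total = p + n
    e = 0.0
    for c in (p, n):
        if c:
            e -= (c / total) * math.log2(c / total)
    return e

def gain(parent, children):
    """Information gain = E(parent) - weighted sum of E over child sets."""
    total = sum(p + n for p, n in children)
    return entropy(*parent) - sum(
        (p + n) / total * entropy(p, n) for p, n in children)

# Slide's example: 4 females / 5 males, split on Weight <= 160
e = entropy(4, 5)                    # 0.9911
g = gain((4, 5), [(4, 1), (0, 4)])   # 0.5900, the best of the three splits
```

The weight split wins (0.59 vs 0.09 for hair length and 0.02 for age), which is why the tree tests weight first.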
  • 83. Weight <= 160? yes no Hair Length <= 2? yes no Of the 3 features we had, Weight was best. But while people who weigh over 160 are perfectly classified (as males), the under 160 people are not perfectly classified… So we simply recurse! This time we find that we can split on Hair length, and we are done!
  • 84. Weight <= 160? yes no Hair Length <= 2? yes no We don’t need to keep the data around, just the test conditions. Male Male Female How would these people be classified?
  • 85. Applications: Acquisition, management and use of knowledge • Storage and management of Information • Making Sense of Knowledge • Acquisition of knowledge – Feature Acquisition – Concept Abstraction • Use of knowledge in and as models – Problem Solving – Simulations
  • 86. Using Knowledge Problem Solving Simulations Searching for a solution Combining models to form a large comprehensive model
  • 87. Problem Solving Basis of the search Order in which nodes are evaluated and expanded Determined by Two Lists OPEN: List of unexpanded nodes CLOSED: List of expanded nodes Searching for a solution through all possible solutions Fundamental algorithm in artificial intelligence Graph Search
  • 88. Abstraction: State of a system Chess Tic-tac-toe Water jug problem Traveling salesman’s problem In problem solving: Search for the steps leading to the solution The individual steps are the states of the system
  • 89. Solution Space The set of all states of the problem Including the goal state(s) All possible board combinations All possible reference points All possible combinations
  • 90. Search Space Each system state (nodes) is connected by rules (connections) on how to get from one state to another
  • 91. Search Space How the states are connected Legal moves Paths between points Possible operations
  • 92. Strategies to Search Space of System States • Breadth first search • Depth first search • Best first search Determines order in which the states are searched to find solution
  • 93. Breadth-first searching • A breadth-first search (BFS) explores nodes nearest the root before exploring nodes further away • For example, after searching A, then B, then C, the search proceeds with D, E, F, G • Nodes are explored in the order A B C D E F G H I J K L M N O P Q • J will be found before N [tree diagram with nodes A through Q]
  • 94. Depth-first searching • A depth-first search (DFS) explores a path all the way to a leaf before backtracking and exploring another path • For example, after searching A, then B, then D, the search backtracks and tries another path from B • Nodes are explored in the order A B D E H L M N I O P C F G J K Q • N will be found before J [tree diagram with nodes A through Q]
  • 95. Breadth First Search Items between red bars are siblings. Continue until the goal is reached or OPEN is empty. Expand A to new nodes B, C, D Expand B to new nodes E, F Send new nodes to the back of the queue Queue: FIFO (first in, first out)
  • 96. Depth first Search Expand A to new nodes B, C, D Expand B to new nodes E, F Send new nodes to the front of the stack Stack: LIFO (last in, first out)
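The only difference between the two searches is the data structure holding the unexpanded (OPEN) nodes. A sketch on the lecture's example tree; the child lists below are reconstructed from the visit orders stated on the BFS/DFS slides:

```python
from collections import deque

# The lecture's example tree (A's children are B and C, and so on),
# reconstructed from the stated BFS and DFS visit orders.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
        "E": ["H", "I"], "G": ["J", "K"], "H": ["L", "M", "N"],
        "I": ["O", "P"], "K": ["Q"]}

def bfs(root):
    """Queue (FIFO): new nodes go to the back, so we search level by level."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order

def dfs(root):
    """Stack (LIFO): new nodes go on top, so we follow one path to a leaf."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(tree.get(node, [])))  # keep left-to-right order
    return order
```

Swapping the queue for a stack turns level-by-level search into path-to-leaf search; everything else is identical.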
  • 97. Best First Search Breadth first search: queue (FIFO) Depth first search: stack (LIFO) Uninformed searches: No knowledge of how good the current solution is (are we on the right track?) Best First Search: Priority Queue Associated with each node is a heuristic F(node) = the quality of the node to lead to a final solution
  • 98. A* search • Idea: avoid expanding paths that are already expensive • Evaluation function f(n) = g(n) + h(n) • g(n) = cost so far to reach n • h(n) = estimated cost from n to goal • f(n) = estimated total cost of path through n to goal This is the hard/unknown part If h(n) is an underestimate, then the algorithm is guaranteed to find a solution
  • 99. Admissible heuristics • A heuristic h(n) is admissible if for every node n, h(n) ≤ h*(n), where h*(n) is the true cost to reach the goal state from n. • An admissible heuristic never overestimates the cost to reach the goal, i.e., it is optimistic • Example: hSLD(n) (never overestimates the actual road distance) • Theorem: If h(n) is admissible, A* using TREE-SEARCH is optimal
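A* with an admissible heuristic can be sketched with a priority queue (heapq). The graph and h-values below are made up for illustration; h never overestimates the true remaining cost, so the search is guaranteed to return the optimal path:

```python
import heapq

# Toy weighted graph and an admissible heuristic (both illustrative).
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)],
         "B": [("G", 1)], "G": []}
h = {"S": 3, "A": 2, "B": 1, "G": 0}  # never overestimates the true cost

def a_star(start, goal):
    """Always expand the frontier node with the smallest f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]  # (f, g, node, path)
    best_g = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if node in best_g and best_g[node] <= g:
            continue  # already expanded via a cheaper path
        best_g[node] = g
        for nxt, cost in graph[node]:
            heapq.heappush(frontier,
                           (g + cost + h[nxt], g + cost, nxt, path + [nxt]))
    return None, float("inf")

path, cost = a_star("S", "G")
```

The direct edge A→G (cost 5) looks tempting, but its f-value stays higher than the detour through B, so A* returns S→A→B→G with total cost 4.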
  • 100. Graph Search Several Structures Used Graph Search The graph as search space Breadth first search Queue Depth first search Stack Best first search Priority Queue Stacks and queues, depending on search strategy
  • 101. Applications: Acquisition, management and use of knowledge • Storage and management of Information • Making Sense of Knowledge • Acquisition of knowledge – Feature Acquistion – Concept Abstraction • Use of knowledge in and as models – Problem Solving – Simulations
  • 103. Climate Model Climate Modeling A multitude of sub-models submodel submodel submodel submodelsubmodel submodel submodel submodelsubmodel submodelsubmodel submodel submodel submodel Many stemming from the techniques discussed previously
  • 104. Physical processes regulating climate Physical models representing all the interactions that can occur
  • 105. Radiation Even one physical quantity can have many source models, sink models and interaction models
  • 106. “Earth System Model” And ocean model, sea-ice model, land surface model, etc… 3D atmosphere 3D ocean 2D sea ice Atmospheric CO2 2D land surface Land biogeochemistry Ocean biogeochemistry Ocean sediments 3D ice sheets
  • 108. Meteorological Primitive Equations • Applicable to wide scale of motions; > 1 hour, > 100 km
  • 109. Global Climate Model Physics Terms F, Q, and Sq represent physical processes • Equations of motion, F – turbulent transport, generation, and dissipation of momentum • Thermodynamic energy equation, Q – convective-scale transport of heat – convective-scale sources/sinks of heat (phase change) – radiative sources/sinks of heat • Water vapor mass continuity equation – convective-scale transport of water substance – convective-scale water sources/sinks (phase change)
  • 110. Model Physical Parameterizations Physical processes breakdown: • Moist Processes – Moist convection, shallow convection, large scale condensation • Radiation and Clouds – Cloud parameterization, radiation • Surface Fluxes – Fluxes from land, ocean and sea ice (from data or models) • Turbulent mixing – Planetary boundary layer parameterization, vertical diffusion, gravity wave drag
  • 111. Process Models and Parameterization •Boundary Layer •Clouds Stratiform Convective •Microphysics
  • 112. Evolution of Global Climate Models (GCMs) … increasing complexity. Due to demand (want/need to model more complex systems) Increased computing power enables more complex models
  • 113. http://www.usgcrp.gov/usgcrp/images/ocp2003/ocpfy2003-fig3-4.htm The past, present and future of climate models During the last 25 years, different components are added to the climate model to better represent our climate system
  • 114. Grid Discretizations Equations are distributed on a sphere • Different grid approaches: – Rectilinear (lat-lon) – Reduced grids – ‘equal area grids’: icosahedral, cubed sphere – Spectral transforms • Different numerical methods for solution: – Spectral Transforms – Finite element – Lagrangian (semi-lagrangian) • Vertical Discretization – Terrain following (sigma) – Pressure – Isentropic – Hybrid Sigma-pressure (most common) The heart of Computational Fluid Dynamics (CFD)
  • 115. Different time and spatial scales Microscopic properties intermingling with macroscopic properties Fast processes (ex. molecular reactions) Interacting with Very slow processes (ex. transport/movement of molecules To other regions) This often makes mathematically solving the problems very difficult
  • 116. 1. How did I get here? ~10^7 m The planetary scale ~10^5 m Cloud cluster scale ~10^3 m Cloud scale ~10^-6 m - 1 m Cloud microphysical scale
  • 117. Scales of Atmospheric Motions/Processes Anthes et al. Resolved Scales Global Models Future Global Models Cloud/Mesoscale/Turbulence Models Cloud Drops Microphysics CHEMISTRY
  • 118. 10 m 100 m 1 km 10 km 100 km 1000 km 10000 km turbulence Cumulus clouds Cumulonimbus clouds Mesoscale Convective systems Extratropical Cyclones Planetary waves Large Eddy Simulation (LES) Model Cloud System Resolving Model (CSRM) Numerical Weather Prediction (NWP) Model Global Climate Model No single model can encompass all relevant processes DNS mm Cloud microphysics
  • 119. Knowledge Representation Abstraction You choose how to represent reality The choice is not unique It depends on what aspect of reality you want to represent and how
  • 120. Applications: Acquisition, management and use of knowledge • Storage and management of Information • Making Sense of Knowledge • Acquisition of knowledge – Feature Acquisition – Concept Abstraction • Problem Solving • Use of knowledge in and as models – Problem Solving – Simulations
  • 121. Storage and management of Information Name Address Parcel # John Smith 18 Lawyers Dr. 756554 T. Brown 14 Summers Tr. 887419 Table A Table B Parcel # Assessed Value 887419 152,000 446397 100,000
  • 122. Making Sense of Knowledge Concept conceptual entity of the domain Attribute property of a concept Relation relationship between concepts or properties Axiom coherent description between Concepts / Properties / Relations via logical expressions Person Student Professor Lecture isA – hierarchy (taxonomy) name email student nr. research field topic lecture nr. attends holds
  • 123. Acquisition of knowledge: Feature Acquisition “flat” region: no change in all directions “edge”: no change along the edge direction “corner”: significant change in all directions From a square sampling of pixels
  • 124. Acquisition of knowledge: Concept Abstraction P(X|C1)*P(C1) > P(X|C2)*P(C2) X will buy a computer Abdomen Length > 7.1? yes: Katydid; no: Antenna Length > 6.0? yes: Katydid; no: Grasshopper
  • 125. Use of knowledge in and as models Problem Solving [tree diagram with nodes A through Q] Breadth first search Queue Depth first search Stack Best first search Priority Queue
  • 126. Use of knowledge in and as models Simulations
  • 127. Applications You choose how to represent reality