Machine Learning Specialization

Nearest Neighbor Search: Retrieving Documents

Emily Fox & Carlos Guestrin
University of Washington

©2016 Emily Fox & Carlos Guestrin
Retrieving documents of interest
Document retrieval
•  Currently reading an article you like
•  Goal: find similar articles
Challenges
•  How do we measure similarity?
•  How do we search over articles?
Retrieval as k-nearest neighbor search
1-NN search for retrieval
Imagine the space of all articles, organized by similarity of text:
1. Place the query article in this space.
2. Compute distances from the query article to all other docs.
3. Retrieve the "nearest neighbor" (or a set of nearest neighbors).
1-NN algorithm
1-Nearest neighbor
•  Input: query article xq
          corpus of documents x1, x2, …, xN
•  Output: most similar article
Formally:
   xNN = argmin over xi of distance(xi, xq)
1-NN algorithm
Initialize Dist2NN = ∞, xNN = Ø
For i = 1, 2, …, N:
    Compute δ = distance(xi, xq)     (document i from corpus vs. query document)
    If δ < Dist2NN:
        set xNN = xi                 (closest document seen so far)
        set Dist2NN = δ
Return xNN: the closest document in the corpus to the query article.
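A minimal Python sketch of this brute-force scan (assuming documents are NumPy feature vectors and using Euclidean distance; both choices are illustrative, not mandated by the slides):

```python
import numpy as np

def nearest_neighbor(query, corpus):
    """Brute-force 1-NN: scan every document, keep the closest one."""
    best_dist, best_idx = np.inf, None
    for i, doc in enumerate(corpus):
        delta = np.linalg.norm(doc - query)   # distance(xi, xq)
        if delta < best_dist:                 # found a closer document
            best_dist, best_idx = delta, i
    return best_idx, best_dist
```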
k-NN algorithm
k-Nearest neighbor
•  Input: query article xq
          corpus of documents x1, x2, …, xN
•  Output: list of k most similar articles
Formally:
   XNN = the k documents xi with smallest distance(xi, xq)
k-NN algorithm
Initialize Dist2kNN = sort(δ1, …, δk)   (list of the first k distances to the query doc, sorted)
           XNN = sort(x1, …, xk)        (the corresponding docs, sorted by distance to the query doc)
For i = k+1, …, N:
    Compute δ = distance(xi, xq)
    If δ < Dist2kNN[k]:
        find j such that δ > Dist2kNN[j-1] but δ < Dist2kNN[j]
        remove the furthest document and shift the queue:
            XNN[j+1:k] = XNN[j:k-1]
            Dist2kNN[j+1:k] = Dist2kNN[j:k-1]
        set Dist2kNN[j] = δ and XNN[j] = xi
Return XNN: the k most similar documents to the query article.
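The same loop in Python, maintaining a sorted list of the current k closest documents (again a sketch with illustrative NumPy vectors and Euclidean distance):

```python
import numpy as np

def k_nearest_neighbors(query, corpus, k):
    """Brute-force k-NN keeping a sorted queue of the k closest docs.
    Assumes len(corpus) >= k."""
    dists = [np.linalg.norm(doc - query) for doc in corpus[:k]]
    order = np.argsort(dists)                    # sort the first k docs
    knn_idx = [int(i) for i in order]
    knn_dist = [dists[i] for i in order]
    for i in range(k, len(corpus)):
        delta = np.linalg.norm(corpus[i] - query)
        if delta < knn_dist[-1]:                 # closer than furthest kept
            j = int(np.searchsorted(knn_dist, delta))
            knn_dist.insert(j, delta); knn_dist.pop()   # shift the queue
            knn_idx.insert(j, i); knn_idx.pop()
    return knn_idx, knn_dist
```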
Critical elements of NN search
1. Item (e.g., doc) representation:
      xq ← feature vector describing the item
2. Measure of distance between items:
      δ = distance(xi, xq)
Document representation
Word count document representation
Bag of words model:
- Ignore order of words
- Count # of instances of each word in the vocabulary
Example: "Carlos calls the sport futbol. Emily calls the sport soccer."
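A quick sketch of the word-count representation in Python (the lowercasing and punctuation stripping are simplifying assumptions):

```python
from collections import Counter

text = "Carlos calls the sport futbol. Emily calls the sport soccer."
tokens = text.lower().replace(".", "").split()   # crude tokenizer
word_counts = Counter(tokens)
# Counter({'calls': 2, 'the': 2, 'sport': 2, 'carlos': 1,
#          'futbol': 1, 'emily': 1, 'soccer': 1})
```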
Issues with word counts: rare words
Common words in the doc ("the", "player", "field", "goal")
dominate rare words like "futbol" and "Messi".
TF-IDF document representation
Emphasizes important words, i.e., words that:
- Appear frequently in the document (common locally)
- Appear rarely in the corpus (rare globally)

Term frequency = word counts
Inverse doc frequency = log( # docs / (1 + # docs using word) )

Trade off local frequency vs. global rarity: tf * idf
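A sketch of this tf-idf weighting in Python, using exactly the slides' idf formula (the `corpus_vocab` representation, one set of distinct words per corpus document, is an assumption for illustration):

```python
import math
from collections import Counter

def tf_idf(doc_tokens, corpus_vocab):
    """tf * idf with idf = log(# docs / (1 + # docs using word)).
    corpus_vocab: one set of distinct words per corpus document."""
    n_docs = len(corpus_vocab)
    tf = Counter(doc_tokens)                 # term frequency = word counts
    weights = {}
    for word, count in tf.items():
        n_using = sum(1 for vocab in corpus_vocab if word in vocab)
        idf = math.log(n_docs / (1.0 + n_using))
        weights[word] = count * idf          # common locally, rare globally
    return weights
```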
Distance metrics
Distance metrics: defining a notion of "closest"
In 1D, just Euclidean distance:
   distance(xi, xq) = |xi - xq|
In multiple dimensions:
-  can define many interesting distance functions
-  most straightforwardly, might want to weight different dimensions differently
Weighting different features
Reasons:
- Some features are more relevant than others
  (e.g., housing features: # bedrooms, # bathrooms, sq.ft. living, sq.ft. lot,
  floors, year built, year renovated, waterfront;
  or parts of a document: title, abstract, main body, conclusion)
- Some features vary more than others
  (small changes in a low-spread feature matter more; big changes in a
  high-spread feature matter less)
Specify weights as a function of feature spread. For feature j:
   1 / ( maxi(xi[j]) - mini(xi[j]) )
Scaled Euclidean distance
Formally, this is achieved via
   distance(xi, xq) = √( a1(xi[1]-xq[1])² + … + ad(xi[d]-xq[d])² )
where aj is the weight on feature j (defining its relative importance).
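A one-line NumPy sketch of the scaled distance (pairing the weight vector `a` with the 1/spread weights above is just one option):

```python
import numpy as np

def scaled_euclidean(xi, xq, a):
    """sqrt( sum_j a[j] * (xi[j] - xq[j])^2 ) with per-feature weights a."""
    diff = xi - xq
    return np.sqrt(np.sum(a * diff * diff))
```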
Effect of binary weights
   distance(xi, xq) = √( a1(xi[1]-xq[1])² + … + ad(xi[d]-xq[d])² )
Setting weights to 0 or 1 is equivalent to feature selection.
Feature engineering/selection is important, but hard.
(Non-scaled) Euclidean distance
Defined in terms of an inner product:
   distance(xi, xq) = √( (xi[1]-xq[1])² + … + (xi[d]-xq[d])² )
                    = √( (xi - xq)ᵀ(xi - xq) )
i.e., distance² = (xi - xq)ᵀ(xi - xq)
Scaled Euclidean distance
   distance(xi, xq) = √( a1(xi[1]-xq[1])² + … + ad(xi[d]-xq[d])² )
                    = √( (xi - xq)ᵀ A (xi - xq) )
where A = diag(a1, a2, …, ad).
Another natural inner product measure
   Similarity = xiᵀxq = Σj=1..d xi[j] xq[j]
Example with overlapping words:
   xi = [1 0 0 0 5 3 0 0 1 0 0 0 0]
   xq = [3 0 0 0 2 0 0 1 0 1 0 0 0]
   Similarity = 1·3 + 5·2 = 13
Example with no overlapping words:
   xi = [1 0 0 0 5 3 0 0 1 0 0 0 0]
   xq = [0 0 1 0 0 0 9 0 0 6 0 4 0]
   Similarity = 0
Cosine similarity – normalize
   Similarity = Σj xi[j] xq[j] / ( √(Σj xi[j]²) √(Σj xq[j]²) )
              = xiᵀxq / ( ||xi|| ||xq|| ) = cos(θ)
-  Not a proper distance metric
-  Efficient to compute for sparse vectors (only overlapping nonzero
   entries contribute to the inner product)
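A minimal cosine-similarity sketch (assumes neither vector is all zeros):

```python
import numpy as np

def cosine_similarity(xi, xq):
    """xi^T xq / (||xi|| ||xq||) = cos(angle between xi and xq)."""
    return (xi @ xq) / (np.linalg.norm(xi) * np.linalg.norm(xq))
```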
Normalize
   x = [1 0 0 0 5 3 0 0 1 0 0 0 0],   ||x|| = √(1² + 5² + 3² + 1²) = 6
   x / ||x|| = [1/6 0 0 0 5/6 3/6 0 0 1/6 0 0 0 0]
Cosine similarity
In general: -1 ≤ similarity ≤ 1
For positive features (like tf-idf): 0 ≤ similarity ≤ 1
Define distance = 1 - similarity.
To normalize or not?
With raw word counts:
   xi = [1 0 0 0 5 3 0 0 1 0 0 0 0],  xq = [3 1 0 0 2 0 0 1 0 1 0 0 0]   →  Similarity = 13
Doubling both documents:
   2xi = [2 0 0 0 10 6 0 0 2 0 0 0 0],  2xq = [6 2 0 0 4 0 0 2 0 2 0 0 0]   →  Similarity = 52
In the normalized case
   xi / ||xi|| = [1/6 0 0 0 5/6 3/6 0 0 1/6 0 0 0 0]
   xq / ||xq|| = [3/4 1/4 0 0 1/2 0 0 1/4 0 1/4 0 0 0]
   Similarity = (1/6)(3/4) + (5/6)(1/2) = 13/24
Doubling both documents leaves the normalized vectors unchanged, so the
similarity is still 13/24.
But not always desired…
Normalizing can make dissimilar objects appear more similar
(e.g., a short tweet vs. a long document).
Common compromise: just cap the maximum word counts.
Other distance metrics
- Mahalanobis
- rank-based
- correlation-based
- Manhattan
- Jaccard
- Hamming
- …
Combining distance metrics
Example of document features:
1.  Text of document
    -  Distance metric: cosine similarity
2.  # of reads of the doc
    -  Distance metric: Euclidean distance
Add together with user-specified weights (see the sketch below).
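A sketch of such a user-weighted combination (the field names and the 0.8/0.2 weights are made up for illustration):

```python
import numpy as np

def combined_distance(a, b, w_text=0.8, w_reads=0.2):
    """Weighted sum of per-feature distances.
    a, b: dicts with a tf-idf vector under 'text' and a scalar under 'reads'."""
    cos = (a["text"] @ b["text"]) / (
        np.linalg.norm(a["text"]) * np.linalg.norm(b["text"]))
    d_text = 1.0 - cos                        # cosine distance on the text
    d_reads = abs(a["reads"] - b["reads"])    # Euclidean distance on # reads
    return w_text * d_text + w_reads * d_reads
```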
Scaling up k-NN search by storing data in a KD-tree
Complexity of search
Complexity of brute-force search
Given a query point, scan through each point:
-  O(N) distance computations per 1-NN query!
-  O(N log k) per k-NN query!
What if N is huge??? (and there are many queries)
KD-tree representation
KD-trees
Structured organization of documents:
-  Recursively partitions points into axis-aligned boxes.
Enables more efficient pruning of the search space.
Works "well" in "low-to-medium" dimensions
-  We'll get back to this…
KD-tree construction
Start with a list of d-dimensional points:

   Pt   x[1]   x[2]
   1    0.00   0.00
   2    1.00   4.31
   3    0.13   2.85
   …    …      …
Split the points into 2 groups by choosing a split dimension and a split
value, e.g., x[1] > 0.5:

   NO (x[1] ≤ 0.5):          YES (x[1] > 0.5):
   Pt   x[1]   x[2]          Pt   x[1]   x[2]
   1    0.00   0.00          2    1.00   4.31
   3    0.13   2.85          …    …      …
   …    …      …

Then recurse on each group separately, choosing a new split dimension and
value at each step (e.g., splitting the NO group on x[2], separating
Pt 1 = (0.00, 0.00) from Pt 3 = (0.13, 2.85)).
Continue splitting the points at each set:
-  Creates a binary tree structure
-  Each leaf node contains a list of points
Keep one additional piece of info at each node:
-  The (tight) bounds of the points at or below the node
KD-tree construction choices
Use heuristics to make splitting decisions:
-  Which dimension do we split along?
-  Which value do we split at?
-  When do we stop?
Many heuristics…
-  median heuristic
-  center-of-range heuristic
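A minimal construction sketch in Python, using a round-robin split dimension and the median heuristic (the dict-based node layout and `leaf_size` cutoff are implementation assumptions):

```python
import numpy as np

def build_kd_tree(points, depth=0, leaf_size=2):
    """Recursively partition points into axis-aligned boxes."""
    points = np.asarray(points, dtype=float)
    node = {"lo": points.min(axis=0), "hi": points.max(axis=0)}  # tight bounds
    if len(points) <= leaf_size:
        node["points"] = points               # leaf: a list of points
        return node
    axis = depth % points.shape[1]            # split dimension (round-robin)
    points = points[points[:, axis].argsort()]
    mid = len(points) // 2                    # split value: the median
    node.update(axis=axis, value=points[mid, axis],
                left=build_kd_tree(points[:mid], depth + 1, leaf_size),
                right=build_kd_tree(points[mid:], depth + 1, leaf_size))
    return node
```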
NN search with KD-trees
Nearest neighbor with KD-trees
Traverse the tree looking for the nearest neighbor to the query point:
1. Start by exploring the leaf node containing the query point.
2. Compute the distance to each other point at that leaf node.
   (Does the nearest neighbor have to live in the leaf node containing
   the query point? No!)
3. Backtrack and try the other branch at each node visited,
   updating the distance bound whenever a new nearest neighbor is found.
Use the distance bound and the bounding box of each node to prune parts
of the tree that cannot include the nearest neighbor.
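A query sketch matching the dict-node layout built above (same assumptions); the pruning test compares the current distance bound against the distance from the query to a node's bounding box:

```python
import numpy as np

def kd_query(node, q, best=(np.inf, None)):
    """1-NN search: descend to the query's leaf, then backtrack with pruning."""
    # Distance from q to the node's bounding box; prune if already too far.
    gap = np.maximum(node["lo"] - q, 0) + np.maximum(q - node["hi"], 0)
    if np.linalg.norm(gap) >= best[0]:
        return best                            # box cannot contain the NN
    if "points" in node:                       # leaf: scan its points
        for p in node["points"]:
            d = np.linalg.norm(p - q)
            if d < best[0]:
                best = (d, p)                  # update the distance bound
        return best
    near, far = ((node["left"], node["right"])
                 if q[node["axis"]] < node["value"]
                 else (node["right"], node["left"]))
    best = kd_query(near, q, best)             # explore the query's side first
    best = kd_query(far, q, best)              # then backtrack to the other
    return best
```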
Complexity
For (nearly) balanced binary trees...
•  Construction
   -  Size: O(N)
   -  Depth: O(log N)
   -  Median + send points left/right: O(N) per level
   -  Construction time: O(N log N)
•  1-NN query
   -  Traverse down tree to starting point: O(log N)
   -  Maximum backtrack and traverse: O(N) in the worst case
   -  Complexity range: O(log N) to O(N)
Under some assumptions on the distribution of points, we get O(log N),
but exponential in d.
Complexity depends on how much is pruned:
-  pruned many → closer to O(log N)
-  pruned few → closer to O(N)
Complexity for N queries
•  Ask for the nearest neighbor to each doc
•  Brute force 1-NN: O(N²)
•  kd-trees: O(N log N) to O(N²)
Inspections vs. N and d
[Plots: # inspections vs. N follows a log(N) trend;
# inspections vs. d follows an exp(d) trend.]
k-NN with KD-trees
Exactly the same algorithm, but maintain the distance to the furthest of
the current k nearest neighbors (e.g., the distance to the 2nd nearest
neighbor in a 2-NN example).
Approximate k-NN search
Approximate k-NN with KD-trees
Before: prune when distance to bounding box > r
Now:    prune when distance to bounding box > r/α  (for some α > 1)
This prunes more than allowed, but we can guarantee that if we return a
neighbor at distance r, then there is no neighbor closer than r/α.
Saves lots of search time at little cost in the quality of the NN!
The bound is loose… in practice, the result is often closer to optimal.
Closing remarks on KD-trees
Tons of variants of kd-trees:
-  On construction of trees
   (heuristics for splitting, stopping, representing branches, …)
-  Other representational data structures for fast NN search
   (e.g., ball trees, …)
Nearest neighbor search:
-  Distance metric and data representation are crucial to the answer returned
For both, high-dimensional spaces are hard!
-  Number of kd-tree searches can be exponential in dimension
   •  Rule of thumb: need N >> 2^d. Typically useless for large d.
-  Distances are sensitive to irrelevant features
   •  Most dimensions are just noise → everything is far away
   •  Need techniques to learn which features are important to a given task
Acknowledgements
Thanks to Andrew Moore (http://www.cs.cmu.edu/~awm/)
for the KD-trees slide outline.
Locality sensitive hashing for approximate NN search
Motivating alternative approaches to approximate NN search
•  KD-trees are cool, but…
   -  Non-trivial to implement efficiently
   -  Problems with high-dimensional data
KD-trees in high dimensions
•  Unlikely to have any data points close to the query point
•  Once a "nearby" point is found, the search radius is likely to
   intersect many hypercubes in at least one dim
•  Not many nodes can be pruned
•  Can show under some conditions that you visit at least 2^d nodes
Moving away from exact NN search
•  Approximate neighbor finding…
   -  Don't find the exact neighbor, but that's okay for many applications
•  Focus on methods that provide good probabilistic guarantees on the
   approximation
Out of millions of articles, do we need the closest article, or just one
that's pretty similar? Do we even fully trust our measure of similarity???
LSH as an alternative to KD-trees
Simple "binning" of data into 2 bins
[Plot: documents in the (#awesome, #awful) plane, split by a line.]
   Score(x) = 1.0 #awesome – 1.5 #awful
Bin by sign: Score(x) > 0 on one side of the line, Score(x) < 0 on the
other. Like a decision boundary in classification.
   2D Data        Sign(Score)   Bin index
   x1 = [0, 5]    -1            0
   x2 = [1, 3]    -1            0
   x3 = [3, 0]    +1            1
   …              …             …

Sign(Score(x)) = +1 maps to bin 1; Sign(Score(x)) = -1 maps to bin 0.
Using bins for NN search
Only search the Score(x) > 0 side for queries with Score(x) > 0, and only
the Score(x) < 0 side for queries with Score(x) < 0: the candidate
neighbors of a query point x are the points in its own bin.
Using score for NN search
Build a hash table mapping each bin to a list containing the indices of
its datapoints:

   Bin             0             1
   Data indices:   {1,2,4,7,…}   {3,5,6,8,…}

For a query with Score(x) < 0, search for the NN only among the set in bin 0.
Provides approximate NN
Is the nearest neighbor to the query point guaranteed to be found? NO:
a true nearest neighbor can land just on the other side of the line.
A practical implementation
Three potential issues with the simple approach
1. Challenging to find a good line
2. Poor quality solution:
   -  Points close together get split into separate bins
3. Large computational cost:
   -  Bins might contain many points, so we are still searching over a
      large set for each NN query
How to define the line?
Crazy idea: define the line randomly!
How bad can a random line be?
Goal: if x and y are close (according to cosine similarity, i.e., the
angle θxy between them is small), we want their binned values to be the
same.
A random line can land outside the angle between x and y (both points in
bin 0, or both in bin 1: bins are the same) or inside it (one point in
bin 0 and the other in bin 1: bins are different).
If θxy is small (x, y close), the points are unlikely to be placed into
separate bins: only lines falling between x and y split them.
Three potential issues with the simple approach
1. Challenging to find a good line → use a random line
2. Poor quality solution: points close together get split into separate
   bins → unlikely for close points under a random line
3. Large computational cost: bins might contain many points, so we are
   still searching over a large set for each NN query → addressed next
Improving efficiency: reducing # points examined per query
Reducing search cost through more bins
Draw several random lines (Line 1, Line 2, Line 3). Each line contributes
one bit, so each region of the plane gets a multi-bit bin index, e.g.,
[0 0 0], [0 1 0], [1 1 0], [1 1 1].
Using score for NN search

   Bin            [0 0 0]  [0 0 1]  [0 1 0]  [0 1 1]  [1 0 0]  [1 0 1]  [1 1 0]  [1 1 1]
                  = 0      = 1      = 2      = 3      = 4      = 5      = 6      = 7
   Data indices:  {1,2}    --       {4,8,11} --       --       --       {7,9,10} {3,5,6}

   2D Data       Sign(Score1)  Bin 1 index  Sign(Score2)  Bin 2 index  Sign(Score3)  Bin 3 index
   x1 = [0, 5]   -1            0            -1            0            -1            0
   x2 = [1, 3]   -1            0            -1            0            -1            0
   x3 = [3, 0]   1             1            1             1            1             1
   …             …             …            …             …             …             …

Search for the NN only among the set in the query's bin.
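A sketch of this scheme in Python: h random lines (hyperplane normals) through the origin, sign patterns as bin indices, and a hash table from bin index to data indices. The random-normal draw and the dict-based table are implementation assumptions:

```python
import numpy as np
from collections import defaultdict

def build_lsh_table(data, h, seed=0):
    """Bin each row of data by the signs of h random scores."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(h, data.shape[1]))  # h random line normals
    bits = (data @ planes.T >= 0).astype(int)     # sign of each score
    table = defaultdict(list)
    for i, b in enumerate(bits):
        table[tuple(b)].append(i)                 # bin index -> data indices
    return planes, table

def lsh_query(q, data, planes, table):
    """Approximate NN: search only the query's own bin."""
    key = tuple(((planes @ q) >= 0).astype(int))
    candidates = table.get(key, [])
    if not candidates:
        return None
    dists = [np.linalg.norm(data[i] - q) for i in candidates]
    return candidates[int(np.argmin(dists))]
```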
Improving search quality by searching neighboring bins
The query point lands in one bin, but is its NN there? Not necessarily.
In fact this is even worse than before: each line can split points, so
close neighbors are more likely to end up in different bins. We are
sacrificing accuracy for speed.
To recover accuracy, also search the next closest bins: those whose bin
index differs from the query's in 1 bit (e.g., from [0 1 0], flip 1 bit
to reach [1 1 0], [0 0 0], or [0 1 1]).
Then search further bins: those whose index differs in 2 bits, and so on.
The quality of the retrieved NN can only improve as we search more bins.
Algorithm: continue searching until the computational budget is reached
or the quality of the NN is good enough. (A sketch follows.)
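A sketch of the neighboring-bins search, extending the table built above; `max_flips` plays the role of the computational budget (an illustrative choice):

```python
import numpy as np
from itertools import combinations

def lsh_query_multi(q, data, planes, table, max_flips=1):
    """Search the query's bin, then bins up to max_flips bit-flips away."""
    key = ((planes @ q) >= 0).astype(int)
    candidates = []
    for r in range(max_flips + 1):
        for flip in combinations(range(len(key)), r):  # bins r flips away
            k = key.copy()
            k[list(flip)] ^= 1
            candidates += table.get(tuple(k), [])
    if not candidates:
        return None
    dists = [np.linalg.norm(data[i] - q) for i in candidates]
    return candidates[int(np.argmin(dists))]
```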
LSH recap
(a kd-tree competitor data structure)
•  Draw h random lines
•  Compute a "score" for each point under each line and translate it to
   a binary index
•  Use the h-bit binary vector per data point as a bin index
•  Create a hash table
•  For each query point x, search bin(x), then neighboring bins, until
   the time limit is reached
Moving to higher dimensions d
Draw random planes
In d dimensions, draw random planes instead of lines, e.g., in 3D with
features x[1] = #awesome, x[2] = #awful, x[3] = #great:
   Score(x) = v1 #awesome + v2 #awful + v3 #great
Cost of binning points in d dimensions
   Score(x) = v1 #awesome + v2 #awful + v3 #great
Per data point, we need d multiplies to determine the bin index per plane.
This is a one-time cost, offset if there are many queries on a fixed dataset.
Using multiple tables for even greater efficiency in NN search (OPTIONAL)
If I throw down 2 lines…
Bins: [0 0] = 0, [0 1] = 1, [1 0] = 2, [1 1] = 3.
For simplicity, assume we search all bins 1 bit off from the query.
Let δ be the probability of one line falling between points θ apart.
Then we search 3 bins, and we fail to find the NN only when both lines
split the query from its NN: probability δ².
What if I repeat the 2-line binning?
Draw 2 new random lines and build a second hash table; then repeat for a
third. Each table has bins [0 0] = 0, [0 1] = 1, [1 0] = 2, [1 1] = 3,
with its own list of data indices per bin.
Now, search only the query's bin in each table. We are still searching 3
bins in total, but what is the chance of not finding the NN?
What is the chance that the query point and its NN are split (land in
different bins) in all of the tables?
Probability of splitting neighboring points many times
In a single 2-line table, the probability that the NN lands in a
different bin than the query:
   Prob = 1 - Pr(same bin)
        = 1 - (1-δ)²
        = 2δ - δ²
Probability that the NN is in a different bin in all 3 tables:
   Prob = (2δ - δ²)³
Comparing approaches for 2-bit tables

   Approach                            # bins searched   prob. of no NN
   1 table, search bins ≤ 1 bit off    3                 δ²
   3 tables, search query bin only     3                 (2δ - δ²)³
Comparing probabilities
[Plot: δ² vs. (2δ - δ²)³ as a function of δ.]
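A quick numerical check of the two miss probabilities at a few illustrative values of δ:

```python
for delta in (0.05, 0.1, 0.2, 0.4):
    one_table = delta ** 2                        # 1 table, bins <= 1 bit off
    three_tables = (2 * delta - delta ** 2) ** 3  # 3 tables, query bin only
    print(f"δ = {delta:.2f}:  δ² = {one_table:.4f}, "
          f"(2δ - δ²)³ = {three_tables:.4f}")
```

For small δ (close neighbors), the multi-table miss probability is much smaller; for larger δ, the single-table approach wins.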
If I throw down h lines…
Bins run from [0 0 0] = 0 to [1 1 1] = 7 (2^h bins; here h = 3).
Still assume we search all bins 1 bit off from the query.
Prob. of being > 1 bit away
   = 1 - Pr(same bin) - Pr(1 bin away)
   = 1 - Pr(no split lines) - Pr(exactly 1 split line)
   = 1 - (1-δ)^h - hδ(1-δ)^(h-1)
So we search h+1 bins and do not find the NN with probability
1 - (1-δ)^h - hδ(1-δ)^(h-1).
Probability of splitting neighboring points many times
With h+1 independent h-line tables, the probability that the NN is in a
different bin in all h+1 tables:
   = (1 - Pr(same bin))^(h+1)
   = (1 - Pr(no split line))^(h+1)
   = (1 - (1-δ)^h)^(h+1)
Comparing approaches for h-bit tables

   Approach                              # bins searched   prob. of no NN
   1 table, search bins ≤ 1 bit off      h+1               1 - (1-δ)^h - hδ(1-δ)^(h-1)
   h+1 tables, search query bin only     h+1               (1 - (1-δ)^h)^(h+1)
Comparing probabilities
[Plots for h = 3 and h = 10: 1 - (1-δ)^h - hδ(1-δ)^(h-1) vs.
(1 - (1-δ)^h)^(h+1) as a function of δ.]
Fix # bits and increase depth
With h bits per table and m+1 tables, the probability that the NN is in
a different bin in all tables falls off exponentially fast:
   Prob = (1 - Pr(same bin))^(m+1)
        = (1 - Pr(no split line))^(m+1)
        = (1 - (1-δ)^h)^(m+1)
This typically gives a higher probability of finding the NN than
searching m bins in 1 table.
Summary of LSH approaches
-  One table, searching neighboring bins: the cost of binning points is
   lower, but we likely need to search more bins per query.
-  Multiple tables, searching only the query's bin in each: the cost of
   binning points is higher, but we likely need to search fewer bins per
   query.
Summary for retrieval using nearest neighbors, KD-trees,
and locality sensitive hashing
What you can do now…
•  Implement nearest neighbor search for retrieval tasks
•  Contrast document representations (e.g., raw word counts, tf-idf, …)
   -  Emphasize important words using tf-idf
•  Contrast methods for measuring similarity between two documents
   -  Euclidean vs. weighted Euclidean
   -  Cosine similarity vs. similarity via unnormalized inner product
•  Describe the complexity of brute-force search
•  Implement KD-trees for nearest neighbor search
•  Implement LSH for approximate nearest neighbor search
•  Compare pros and cons of KD-trees and LSH, and decide which is more
   appropriate for a given dataset
