It Probably Works
Tyler McMullen
CTO of Fastly
@tbmcmullen
Fastly
We’re an awesome CDN.
What is a probabilistic
algorithm?
Why bother?
–Hal Abelson and Gerald J. Sussman, SICP
“In testing primality of very large numbers chosen at
random, the chance of stumbling upon a value that
fools the Fermat test is less than the chance that
cosmic radiation will cause the computer to make
an error in carrying out a 'correct' algorithm.
Considering an algorithm to be inadequate for the
first reason but not for the second illustrates the
difference between mathematics and engineering.”
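For reference, the Fermat test mentioned in the quote can be sketched in a few lines. This is a minimal illustration rather than code from the talk; the function name `fermat_test` and the `trials` parameter are assumptions made here.

```python
import random

def fermat_test(n, trials=20):
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n - 1)  # random base in [2, n - 2]
        # Fermat's little theorem: a^(n-1) == 1 (mod n) whenever n is prime
        if pow(a, n - 1, n) != 1:
            return False  # a is a witness, so n is certainly composite
    return True  # no witness found, so n is prime with high probability

print(fermat_test(97))   # True
print(fermat_test(100))  # False
```

A `False` answer is always correct. A `True` answer can be wrong for rare composites that pass the test for many bases, which is the kind of bounded error the quote is weighing against hardware faults.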
Everything is
probabilistic
Probabilistic algorithms
are not “guessing”
Provably bounded
error rates
Don’t use them if
someone’s life depends
on it.
When should I use
these?
Ok, now what?
The Count-distinct
Problem
The Problem
How many unique words are in a large
corpus of text?
The Problem
How many different users visited a
popular website in a day?
The Problem
How many unique IPs have connected to a
server over the last hour?
The Problem
How many unique URLs have been requested
through an HTTP proxy?
The Problem
How many unique URLs have been requested
through an entire network of HTTP proxies?
166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 20246
103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 29870
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "HEAD /page/1" 200 20409
150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 42999
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 25080
150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 33617
103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "HEAD /listing/567" 200 29618
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "HEAD /page/1" 200 30265
166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 5683
244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "HEAD /article/490" 200 47124
103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "HEAD /listing/567" 200 48734
103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 27392
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 15705
150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 22587
244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "HEAD /product/982" 200 30063
244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 6041
166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 25783
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 1099
244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 31494
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 30389
150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 10251
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 19384
150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "HEAD /product/982" 200 24062
244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 19070
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /page/648" 200 45159
191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "HEAD /page/648" 200 5576
166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /page/648" 200 41869
166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 42414
def count_distinct(stream):
    seen = set()
    for item in stream:
        seen.add(item)
    return len(seen)
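As a usage sketch, the exact counter can be run over access-log lines in the format shown on the previous slide. The `LOG_LINES` sample and the two helper generators are illustrative assumptions, not part of the talk.

```python
def count_distinct(stream):
    seen = set()
    for item in stream:
        seen.add(item)
    return len(seen)

# A handful of lines in the same format as the sample log
LOG_LINES = [
    '166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 20246',
    '103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 29870',
    '191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "HEAD /page/1" 200 20409',
    '150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 42999',
    '191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 25080',
]

def client_ips(lines):
    # The client IP is the first whitespace-separated field
    return (line.split()[0] for line in lines)

def request_paths(lines):
    # The request is the quoted field, formatted as "METHOD PATH"
    return (line.split('"')[1].split()[1] for line in lines)

print(count_distinct(client_ips(LOG_LINES)))     # 4
print(count_distinct(request_paths(LOG_LINES)))  # 3
```

The answer is exact, but the `seen` set holds one entry per unique item, which is the cost the following slides address.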
Scale.
Count-distinct across
thousands of servers.
def combined_cardinality(seen_sets):
    combined = set()
    for seen in seen_sets:
        combined |= seen
    return len(combined)
The set grows linearly.
Precision comes at a cost.
“example.com/user/page/1”
1 0 1 1 0 1 0 1
P(bit 0 is set) = 0.5
Expected trials = 2
P(bit 0 is set & bit 1 is set) = 0.25
Expected trials = 4
P(bit 0 is set & bit 1 is set & bit 2 is set) = 0.125
Expected trials = 8
We expect the maximum number of leading
zeros we have seen + 1 to approximate
log2(unique items).
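The observation above can be turned into a deliberately crude single-register estimator. This is a sketch under assumptions: the 32-bit hash width, the use of SHA-1 as the hash, and the names `hash32`, `rank`, and `rough_cardinality` are choices made here, not taken from the talk.

```python
import hashlib

HASH_BITS = 32  # assumed hash width for this sketch

def hash32(item):
    # First 4 bytes of SHA-1 as an unsigned 32-bit integer
    return int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:4], "big")

def rank(value, bits=HASH_BITS):
    # Number of leading zeros in a `bits`-wide value, plus one
    return bits - value.bit_length() + 1

def rough_cardinality(stream):
    max_rank = 0
    for item in stream:
        max_rank = max(max_rank, rank(hash32(item)))
    # max_rank approximates log2(unique items), so 2**max_rank approximates the count
    return 2 ** max_rank

print(rank(0b10110101, bits=8))  # 1
print(rank(0b00010000, bits=8))  # 4
```

Duplicates hash to the same value, so they never change `max_rank`. The estimate is always a power of two and a single unlucky hash can throw it off badly, which is the motivation for partitioning the input.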
Improve the accuracy of
the estimate by
partitioning the input data.
import numpy as np

# hash_fn, hash_len, and Alpha are not defined on the slides
class LogLog(object):
    def __init__(self, k):
        self.k = k
        self.m = 2 ** k
        self.M = np.zeros(self.m, dtype=int)
        self.alpha = Alpha[k]

    def insert(self, token):
        y = hash_fn(token)
        j = y >> (hash_len - self.k)
        remaining = y & ((1 << (hash_len - self.k)) - 1)
        # bit_length() avoids math.log(0) when `remaining` is zero
        first_set_bit = (hash_len - self.k) - int(remaining).bit_length() + 1
        self.M[j] = max(self.M[j], first_set_bit)

    def cardinality(self):
        return self.alpha * 2 ** np.mean(self.M)
Unions of
HyperLogLogs
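Because each register stores a maximum, two sketches built with the same hash and the same number of registers can be merged by taking the element-wise maximum. The sketch below illustrates this with plain lists; the constants `K` and `HASH_BITS`, the SHA-1 based hash, and the function names are assumptions made for the example.

```python
import hashlib

K = 4           # assumed: 2**K registers
HASH_BITS = 32  # assumed hash width

def registers_for(stream):
    """Build LogLog-style registers (maximum rank per bucket) for a stream."""
    regs = [0] * (1 << K)
    for item in stream:
        y = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:4], "big")
        j = y >> (HASH_BITS - K)                 # bucket index: top K bits
        rest = y & ((1 << (HASH_BITS - K)) - 1)  # remaining bits
        rank = (HASH_BITS - K) - rest.bit_length() + 1
        regs[j] = max(regs[j], rank)
    return regs

def union(regs_a, regs_b):
    # Each register holds a maximum, so the union is the element-wise maximum
    return [max(a, b) for a, b in zip(regs_a, regs_b)]

server_a = ["/page/1", "/article/490", "/product/982"]
server_b = ["/page/1", "/listing/567", "/page/648"]
merged = union(registers_for(server_a), registers_for(server_b))
print(merged == registers_for(server_a + server_b))  # True
```

The merged registers are identical to those of a single sketch that saw both streams, so each server only needs to ship its small register array rather than its full set of items.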
HyperLogLog
• Adding an item: O(1)
• Retrieving cardinality: O(1)
• Space: O(log log n)
• Error rate: 2%
HyperLogLog
For 100 million unique items,
and an error rate of 2%,
the size of the HyperLogLog is...
1,500 bytes
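One way to arrive at roughly this figure: the typical relative error of HyperLogLog with m registers is about 1.04/sqrt(m), and a small register of around 6 bits is enough to hold the rank of a 64-bit hash. The 6-bit register width and the choice of 2048 registers are assumptions made here to reproduce the number on the slide.

```python
import math

def hll_standard_error(m):
    # Typical relative error of HyperLogLog with m registers
    return 1.04 / math.sqrt(m)

def hll_size_bytes(m, bits_per_register=6):
    # Total register storage, ignoring any implementation overhead
    return m * bits_per_register // 8

m = 2 ** 11  # 2048 registers
print(round(hll_standard_error(m) * 100, 1))  # 2.3 (percent)
print(hll_size_bytes(m))                      # 1536
```

The size depends on the number of registers, not on how many items have been inserted.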
Reliable Broadcast
The Problem
Reliably broadcast “purge” messages across the world
as quickly as possible.
Single source of
truth
Single source of
failures
Atomic broadcast
Reliable broadcast
Gossip Protocols
“Designed for Scale”
Probabilistic Guarantees
Bimodal Multicast
• Quickly broadcast message to all servers
• Gossip to recover lost messages
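The two phases listed above can be illustrated with a toy simulation. This is a simplified sketch, not the protocol as deployed: it tracks a single message, models loss as an independent coin flip per server, and lets each server pull from one random peer per round. All names and parameters are assumptions.

```python
import random

def simulate(num_servers=50, loss=0.3, max_rounds=100, seed=1):
    rng = random.Random(seed)
    # Phase 1: unreliable broadcast; server 0 is the sender and always has the message
    has_msg = [True] + [rng.random() >= loss for _ in range(num_servers - 1)]
    coverage = [sum(has_msg)]
    # Phase 2: gossip rounds; each server compares notes with one random peer
    for _ in range(max_rounds):
        if all(has_msg):
            break
        snapshot = list(has_msg)
        for server in range(num_servers):
            peer = rng.randrange(num_servers - 1)
            if peer >= server:
                peer += 1  # skip self
            if snapshot[peer] and not snapshot[server]:
                has_msg[server] = True  # recover the lost message from the peer
        coverage.append(sum(has_msg))
    return coverage

print(simulate())  # servers holding the message after the broadcast and after each round
```

The broadcast reaches most servers immediately, and the gossip rounds fill in the remainder without any central coordinator.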
One Problem
Computers have limited space
Throw away messages
“with high probability” is fine
Real World
End-to-End Latency
[Figure: density plot and 95th percentile of purge latency by server location. Panels: New York (42ms), London (74ms), San Jose (83ms), Tokyo (133ms). X axis: Latency (ms), 0 to 150. Y axis: Density.]
Packet Loss
Good systems are boring
What was the point again?
We can build things that
are otherwise
unrealistic
We can build systems
that are more
reliable
You’re already using them.
We’re hiring!
Thanks
@tbmcmullen
What even is this?
Probabilistic Algorithms
Randomized Algorithms
Estimation Algorithms
Probabilistic Algorithms
1. An iota of theory
2. Where are they useful and where are they not?
3. HyperLogLog
4. Locality-sensitive Hashing
5. Bimodal Multicast
“An algorithm that uses randomness to improve
its efficiency”
Las Vegas
Monte Carlo
Las Vegas
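from random import randrange

def find_las_vegas(haystack, needle):
    # Always returns a correct index, but the running time is random
    # (and unbounded if `needle` is not in `haystack`)
    length = len(haystack)
    while True:
        index = randrange(length)
        if haystack[index] == needle:
            return index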
Monte Carlo
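from random import randrange

def find_monte_carlo(haystack, needle, k):
    # Running time is bounded by `k` probes, but the search may fail:
    # returns None if no probe hits `needle`
    length = len(haystack)
    for i in range(k):
        index = randrange(length)
        if haystack[index] == needle:
            return index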
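The trade-off between the two variants can be quantified. If `matches` of the `n` slots hold the needle, each uniform probe misses with probability 1 - matches/n, so `k` independent probes all miss with probability (1 - matches/n)^k. The helper names below are illustrative assumptions.

```python
def miss_probability(n, matches, k):
    """Chance that k independent uniform probes of n slots all miss `matches` slots."""
    return (1 - matches / n) ** k

def probes_needed(n, matches, max_miss):
    """Smallest k whose miss probability is at most max_miss."""
    if matches <= 0:
        raise ValueError("no probe can succeed when nothing matches")
    k = 0
    while miss_probability(n, matches, k) > max_miss:
        k += 1
    return k

print(probes_needed(2, 1, 0.25))  # 2
```

This is the sense in which the Monte Carlo version has a bounded error rate: choosing `k` fixes both the worst-case running time and the probability of giving up.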
– Prabhakar Raghavan (author of Randomized Algorithms)
“For many problems a randomized
algorithm is the simplest, the
fastest, or both.”
Naive Solution
For 100 million unique IPv4 addresses,
the size of the hash is...
>400 MB
Slightly Less Naive
Add each IP to a bloom filter and keep a
counter of the IPs that don’t collide.
Slightly Less Naive
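ips_seen = BloomFilter(capacity=expected_size, error_rate=0.03)
counter = 0

for line in log_file:
    ip = extract_ip(line)
    # Assumes add() returns True when the IP was (probably) already
    # in the filter, as pybloom's BloomFilter.add does
    if not ips_seen.add(ip):
        counter += 1

print("Unique IPs:", counter)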
Slightly Less Naive
• Adding an IP: O(1)
• Retrieving cardinality: O(1)
• Space: O(n)
• Error rate: 3%
kind of
Slightly Less Naive
For 100 million unique IPv4 addresses,
and an error rate of 3%,
the size of the bloom filter is...
87 MB
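The figure above is consistent with the standard Bloom filter sizing formula, m = -n ln(p) / (ln 2)^2 bits for n items at false-positive rate p. The short calculation below is a sketch of that arithmetic; the function names are assumptions, and reading the slide's unit as mebibytes is an interpretation.

```python
import math

def bloom_bits(n, p):
    # Optimal number of bits for n items at false-positive rate p
    return -n * math.log(p) / (math.log(2) ** 2)

def bloom_hashes(n, m):
    # Optimal number of hash functions for m bits and n items
    return (m / n) * math.log(2)

n, p = 100_000_000, 0.03
m = bloom_bits(n, p)
print(round(m / 8 / 2**20))       # 87 (MiB)
print(round(bloom_hashes(n, m)))  # 5
print(n * 4 // 10**6)             # 400 (MB for the raw 4-byte addresses alone)
```

The filter still grows linearly with the number of items; it only shrinks the constant, at roughly 7.3 bits per address instead of 32 or more.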
def insert(self, token):
    # Get hash of token
    y = hash_fn(token)

    # Extract `k` most significant bits of `y`
    j = y >> (hash_len - self.k)

    # Extract remaining bits of `y`
    remaining = y & ((1 << (hash_len - self.k)) - 1)

    # Find "first" set bit of `remaining`
    # (bit_length() avoids math.log(0) when `remaining` is zero)
    first_set_bit = (hash_len - self.k) - int(remaining).bit_length() + 1

    # Update `M[j]` to max of `first_set_bit`
    # and existing value of `M[j]`
    self.M[j] = max(self.M[j], first_set_bit)
def cardinality(self):
    # The mean of `M` estimates `log2(n)` with
    # an additive bias
    return self.alpha * 2 ** np.mean(self.M)
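To try the idea end to end, here is a self-contained sketch with the missing pieces filled in by assumption. The slides do not show `hash_fn`, `hash_len`, or `Alpha`; this version uses the first 64 bits of SHA-1 as the hash, writes the register count `m` explicitly in the estimate, and uses the asymptotic LogLog constant of about 0.39701 from Durand and Flajolet.

```python
import hashlib

HASH_LEN = 64        # assumed hash width in bits
ALPHA_INF = 0.39701  # asymptotic LogLog bias-correction constant

def hash_fn(token):
    # First 8 bytes of SHA-1 as an unsigned 64-bit integer (an assumption)
    return int.from_bytes(hashlib.sha1(token.encode("utf-8")).digest()[:8], "big")

class LogLog:
    def __init__(self, k):
        self.k = k
        self.m = 2 ** k
        self.M = [0] * self.m

    def insert(self, token):
        y = hash_fn(token)
        j = y >> (HASH_LEN - self.k)                      # register index
        remaining = y & ((1 << (HASH_LEN - self.k)) - 1)  # bits used for the rank
        rank = (HASH_LEN - self.k) - remaining.bit_length() + 1
        self.M[j] = max(self.M[j], rank)

    def cardinality(self):
        mean = sum(self.M) / self.m
        # LogLog estimate: alpha * m * 2^(mean register value)
        return ALPHA_INF * self.m * 2 ** mean

sketch = LogLog(k=10)
for i in range(100_000):
    sketch.insert("user-%d" % i)
    sketch.insert("user-%d" % i)  # duplicates do not change the registers
print(round(sketch.cardinality()))
```

With 1,024 registers the expected relative error of plain LogLog is around 4%, so the printed estimate should land near 100,000 while storing only 1,024 small integers.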
The Problem
Find documents that are similar to one
specific document.
The Problem
Find images that are similar to one
specific image.
The Problem
Find graphs that are correlated to one
specific graph.
The Problem
Nearest neighbor search.
The Problem
“Find the n closest points in a d-dimensional space.”
The Problem
You have a bunch of things and you want to
figure out which ones are similar.
“There has been a lot of recent work on streaming algorithms,
i.e. algorithms that produce an output by making
one pass (or a few passes) over the data while using a
limited amount of storage space and time. To cite a few
examples, ...”
{ "there": 1, "has": 1, "been": 1, "a": 4,
  "lot": 1, "of": 2, "recent": 1, ... }
• Cosine similarity
• Jaccard similarity
• Euclidean distance
• etc etc etc
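The measures listed above can be computed directly on word-count dictionaries like the one on the previous slide. This is a minimal sketch; the function names and the two sample documents are assumptions made for illustration.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    # Cosine of the angle between the two count vectors
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def jaccard_similarity(a, b):
    # Shared vocabulary over combined vocabulary, ignoring counts
    words_a, words_b = set(a), set(b)
    return len(words_a & words_b) / len(words_a | words_b)

def euclidean_distance(a, b):
    # Straight-line distance between the two count vectors
    return math.sqrt(sum((a.get(w, 0) - b.get(w, 0)) ** 2 for w in set(a) | set(b)))

doc1 = Counter("a lot of recent work".split())
doc2 = Counter("a lot of work".split())
print(jaccard_similarity(doc1, doc2))  # 0.8
print(euclidean_distance(doc1, doc2))  # 1.0
```

Each measure needs the full vectors of both documents, so comparing one document against a large collection this way costs a pass over every document.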
Euclidean Distance
Metric space
kd-trees
Curse of Dimensionality
Locality-sensitive hashing
Locality-Sensitive Hashing for Finding Nearest Neighbors
- Slaney and Casey
Random Hyperplanes
{ "there": 1, "has": 1, "been": 1, "a": 4,
  "lot": 1, "of": 2, "recent": 1, ... }
LSH
0 1 1 1 0 1 0 0 0 0 1 0 1 0 1 0 ...
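A bit string like the one above can be produced with random hyperplanes: each bit records which side of a random hyperplane the document's word-count vector falls on. The sketch below is a minimal illustration under assumptions; the function names, the 16-bit signature length, and the second sample document are not from the talk.

```python
import math
import random

def make_hyperplanes(vocabulary, num_bits, seed=0):
    rng = random.Random(seed)
    words = sorted(vocabulary)  # fixed order so the seed gives repeatable planes
    # One random normal vector per output bit, with a component per word
    return [{w: rng.gauss(0, 1) for w in words} for _ in range(num_bits)]

def signature(counts, hyperplanes):
    # Bit i records which side of hyperplane i the document vector falls on
    return [1 if sum(c * plane[w] for w, c in counts.items()) >= 0 else 0
            for plane in hyperplanes]

def estimated_cosine(sig_a, sig_b):
    # The fraction of differing bits estimates angle / pi
    differing = sum(x != y for x, y in zip(sig_a, sig_b))
    return math.cos(math.pi * differing / len(sig_a))

doc_a = {"there": 1, "has": 1, "been": 1, "a": 4, "lot": 1, "of": 2, "recent": 1}
doc_b = {"there": 1, "has": 1, "been": 1, "a": 3, "lot": 1, "of": 2, "work": 1}
planes = make_hyperplanes(set(doc_a) | set(doc_b), num_bits=16)
print(signature(doc_a, planes))
print(estimated_cosine(signature(doc_a, planes), signature(doc_b, planes)))
```

Documents pointing in similar directions agree on most bits, so comparing short signatures stands in for comparing the full vectors, and more bits give a tighter estimate.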
Cosine Similarity
LSH
Firewall Partition
DDoS
• `

More Related Content

What's hot

Network Analysis with networkX : Real-World Example-1
Network Analysis with networkX : Real-World Example-1Network Analysis with networkX : Real-World Example-1
Network Analysis with networkX : Real-World Example-1
Kyunghoon Kim
 
Time Series Analysis for Network Secruity
Time Series Analysis for Network SecruityTime Series Analysis for Network Secruity
Time Series Analysis for Network Secruity
mrphilroth
 
Machine Learning Model Bakeoff
Machine Learning Model BakeoffMachine Learning Model Bakeoff
Machine Learning Model Bakeoff
mrphilroth
 
Ember
EmberEmber
Ember
mrphilroth
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.
Albert Bifet
 
Distributed GLM with H2O - Atlanta Meetup
Distributed GLM with H2O - Atlanta MeetupDistributed GLM with H2O - Atlanta Meetup
Distributed GLM with H2O - Atlanta Meetup
Sri Ambati
 
Monitoring Complex Systems: Keeping Your Head on Straight in a Hard World
Monitoring Complex Systems: Keeping Your Head on Straight in a Hard WorldMonitoring Complex Systems: Keeping Your Head on Straight in a Hard World
Monitoring Complex Systems: Keeping Your Head on Straight in a Hard World
Brian Troutwine
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
Albert Bifet
 
Monitoring Complex Systems - Chicago Erlang, 2014
Monitoring Complex Systems - Chicago Erlang, 2014Monitoring Complex Systems - Chicago Erlang, 2014
Monitoring Complex Systems - Chicago Erlang, 2014
Brian Troutwine
 
Transfer Learning for Performance Analysis of Machine Learning Systems
Transfer Learning for Performance Analysis of Machine Learning SystemsTransfer Learning for Performance Analysis of Machine Learning Systems
Transfer Learning for Performance Analysis of Machine Learning Systems
Pooyan Jamshidi
 
[243] turning data into value
[243] turning data into value[243] turning data into value
[243] turning data into value
NAVER D2
 
Weather of the Century: Visualization
Weather of the Century: VisualizationWeather of the Century: Visualization
Weather of the Century: Visualization
MongoDB
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
Albert Bifet
 
Practical Ai Class 3
Practical Ai Class 3Practical Ai Class 3
Practical Ai Class 3
Oliver Zhang
 
Machine Learning meets DevOps
Machine Learning meets DevOpsMachine Learning meets DevOps
Machine Learning meets DevOps
Pooyan Jamshidi
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data Science
Albert Bifet
 
The Weather of the Century
The Weather of the CenturyThe Weather of the Century
The Weather of the Century
MongoDB
 
The Weather of the Century Part 3: Visualization
The Weather of the Century Part 3: VisualizationThe Weather of the Century Part 3: Visualization
The Weather of the Century Part 3: Visualization
MongoDB
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
Yalçın Yenigün
 

What's hot (20)

Network Analysis with networkX : Real-World Example-1
Network Analysis with networkX : Real-World Example-1Network Analysis with networkX : Real-World Example-1
Network Analysis with networkX : Real-World Example-1
 
Time Series Analysis for Network Secruity
Time Series Analysis for Network SecruityTime Series Analysis for Network Secruity
Time Series Analysis for Network Secruity
 
Machine Learning Model Bakeoff
Machine Learning Model BakeoffMachine Learning Model Bakeoff
Machine Learning Model Bakeoff
 
Ember
EmberEmber
Ember
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.
 
Distributed GLM with H2O - Atlanta Meetup
Distributed GLM with H2O - Atlanta MeetupDistributed GLM with H2O - Atlanta Meetup
Distributed GLM with H2O - Atlanta Meetup
 
Monitoring Complex Systems: Keeping Your Head on Straight in a Hard World
Monitoring Complex Systems: Keeping Your Head on Straight in a Hard WorldMonitoring Complex Systems: Keeping Your Head on Straight in a Hard World
Monitoring Complex Systems: Keeping Your Head on Straight in a Hard World
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
 
Monitoring Complex Systems - Chicago Erlang, 2014
Monitoring Complex Systems - Chicago Erlang, 2014Monitoring Complex Systems - Chicago Erlang, 2014
Monitoring Complex Systems - Chicago Erlang, 2014
 
Transfer Learning for Performance Analysis of Machine Learning Systems
Transfer Learning for Performance Analysis of Machine Learning SystemsTransfer Learning for Performance Analysis of Machine Learning Systems
Transfer Learning for Performance Analysis of Machine Learning Systems
 
Logs
LogsLogs
Logs
 
[243] turning data into value
[243] turning data into value[243] turning data into value
[243] turning data into value
 
Weather of the Century: Visualization
Weather of the Century: VisualizationWeather of the Century: Visualization
Weather of the Century: Visualization
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
 
Practical Ai Class 3
Practical Ai Class 3Practical Ai Class 3
Practical Ai Class 3
 
Machine Learning meets DevOps
Machine Learning meets DevOpsMachine Learning meets DevOps
Machine Learning meets DevOps
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data Science
 
The Weather of the Century
The Weather of the CenturyThe Weather of the Century
The Weather of the Century
 
The Weather of the Century Part 3: Visualization
The Weather of the Century Part 3: VisualizationThe Weather of the Century Part 3: Visualization
The Weather of the Century Part 3: Visualization
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
 

Viewers also liked

Performance Measuring & Monitoring - Catchpoint CEO Mehdi Daoudi at Fastly Al...
Performance Measuring & Monitoring - Catchpoint CEO Mehdi Daoudi at Fastly Al...Performance Measuring & Monitoring - Catchpoint CEO Mehdi Daoudi at Fastly Al...
Performance Measuring & Monitoring - Catchpoint CEO Mehdi Daoudi at Fastly Al...
Fastly
 
Applying Varnish
Applying VarnishApplying Varnish
Applying Varnish
Fastly
 
Beyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisBeyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic Analysis
Fastly
 
Confident Refactoring - Ember SF Meetup
Confident Refactoring - Ember SF MeetupConfident Refactoring - Ember SF Meetup
Confident Refactoring - Ember SF Meetup
Fastly
 
Design & Performance - Steve Souders at Fastly Altitude 2015
Design & Performance - Steve Souders at Fastly Altitude 2015Design & Performance - Steve Souders at Fastly Altitude 2015
Design & Performance - Steve Souders at Fastly Altitude 2015
Fastly
 
The Fallacy of Fast - Ines Sombra at Fastly Altitude 2015
The Fallacy of Fast - Ines Sombra at Fastly Altitude 2015The Fallacy of Fast - Ines Sombra at Fastly Altitude 2015
The Fallacy of Fast - Ines Sombra at Fastly Altitude 2015
Fastly
 
Fallacy of Fast
Fallacy of FastFallacy of Fast
Fallacy of Fast
Fastly
 
Top 5 Things I've Messed Up in Live Streaming
Top 5 Things I've Messed Up in Live StreamingTop 5 Things I've Messed Up in Live Streaming
Top 5 Things I've Messed Up in Live Streaming
Fastly
 
Advanced VCL Workshop - Rogier Mulhuijzen and Stephen Basile at Fastly Altitu...
Advanced VCL Workshop - Rogier Mulhuijzen and Stephen Basile at Fastly Altitu...Advanced VCL Workshop - Rogier Mulhuijzen and Stephen Basile at Fastly Altitu...
Advanced VCL Workshop - Rogier Mulhuijzen and Stephen Basile at Fastly Altitu...
Fastly
 
It Probably Works
It Probably WorksIt Probably Works
It Probably Works
Fastly
 
Developing a Globally Distributed Purging System
Developing a Globally Distributed Purging SystemDeveloping a Globally Distributed Purging System
Developing a Globally Distributed Purging System
Fastly
 
Interative Traffic Engineering in Changing Internet Economics - Tom Daly at L...
Interative Traffic Engineering in Changing Internet Economics - Tom Daly at L...Interative Traffic Engineering in Changing Internet Economics - Tom Daly at L...
Interative Traffic Engineering in Changing Internet Economics - Tom Daly at L...
Fastly
 
What we can learn from CDNs about Web Development, Deployment, and Performance
What we can learn from CDNs about Web Development, Deployment, and PerformanceWhat we can learn from CDNs about Web Development, Deployment, and Performance
What we can learn from CDNs about Web Development, Deployment, and Performance
Fastly
 
Rails Caching: Secrets From the Edge
Rails Caching: Secrets From the EdgeRails Caching: Secrets From the Edge
Rails Caching: Secrets From the Edge
Fastly
 
Building a better web
Building a better webBuilding a better web
Building a better web
Fastly
 
The future of the edge
The future of the edge The future of the edge
The future of the edge
Fastly
 
Migrating Target to Fastly - Eddie Roger at Fastly Altitude 2015
Migrating Target to Fastly - Eddie Roger at Fastly Altitude 2015Migrating Target to Fastly - Eddie Roger at Fastly Altitude 2015
Migrating Target to Fastly - Eddie Roger at Fastly Altitude 2015
Fastly
 
Making ops life easier
Making ops life easierMaking ops life easier
Making ops life easier
Fastly
 
From Zero to Capacity Planning
From Zero to Capacity PlanningFrom Zero to Capacity Planning
From Zero to Capacity Planning
Fastly
 
Debugging Your CDN - Austin Spires at Fastly Altitude 2015
Debugging Your CDN - Austin Spires at Fastly Altitude 2015Debugging Your CDN - Austin Spires at Fastly Altitude 2015
Debugging Your CDN - Austin Spires at Fastly Altitude 2015
Fastly
 

Viewers also liked (20)

Performance Measuring & Monitoring - Catchpoint CEO Mehdi Daoudi at Fastly Al...
Performance Measuring & Monitoring - Catchpoint CEO Mehdi Daoudi at Fastly Al...Performance Measuring & Monitoring - Catchpoint CEO Mehdi Daoudi at Fastly Al...
Performance Measuring & Monitoring - Catchpoint CEO Mehdi Daoudi at Fastly Al...
 
Applying Varnish
Applying VarnishApplying Varnish
Applying Varnish
 
Beyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisBeyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic Analysis
 
Confident Refactoring - Ember SF Meetup
Confident Refactoring - Ember SF MeetupConfident Refactoring - Ember SF Meetup
Confident Refactoring - Ember SF Meetup
 
Design & Performance - Steve Souders at Fastly Altitude 2015
Design & Performance - Steve Souders at Fastly Altitude 2015Design & Performance - Steve Souders at Fastly Altitude 2015
Design & Performance - Steve Souders at Fastly Altitude 2015
 
The Fallacy of Fast - Ines Sombra at Fastly Altitude 2015
The Fallacy of Fast - Ines Sombra at Fastly Altitude 2015The Fallacy of Fast - Ines Sombra at Fastly Altitude 2015
The Fallacy of Fast - Ines Sombra at Fastly Altitude 2015
 
Fallacy of Fast
Fallacy of FastFallacy of Fast
Fallacy of Fast
 
Top 5 Things I've Messed Up in Live Streaming
Top 5 Things I've Messed Up in Live StreamingTop 5 Things I've Messed Up in Live Streaming
Top 5 Things I've Messed Up in Live Streaming
 
Advanced VCL Workshop - Rogier Mulhuijzen and Stephen Basile at Fastly Altitu...
Advanced VCL Workshop - Rogier Mulhuijzen and Stephen Basile at Fastly Altitu...Advanced VCL Workshop - Rogier Mulhuijzen and Stephen Basile at Fastly Altitu...
Advanced VCL Workshop - Rogier Mulhuijzen and Stephen Basile at Fastly Altitu...
 
It Probably Works
It Probably WorksIt Probably Works
It Probably Works
 
Developing a Globally Distributed Purging System
Developing a Globally Distributed Purging SystemDeveloping a Globally Distributed Purging System
Developing a Globally Distributed Purging System
 
Interative Traffic Engineering in Changing Internet Economics - Tom Daly at L...
Interative Traffic Engineering in Changing Internet Economics - Tom Daly at L...Interative Traffic Engineering in Changing Internet Economics - Tom Daly at L...
Interative Traffic Engineering in Changing Internet Economics - Tom Daly at L...
 
What we can learn from CDNs about Web Development, Deployment, and Performance
What we can learn from CDNs about Web Development, Deployment, and PerformanceWhat we can learn from CDNs about Web Development, Deployment, and Performance
What we can learn from CDNs about Web Development, Deployment, and Performance
 
Rails Caching: Secrets From the Edge
Rails Caching: Secrets From the EdgeRails Caching: Secrets From the Edge
Rails Caching: Secrets From the Edge
 
Building a better web
Building a better webBuilding a better web
Building a better web
 
The future of the edge
The future of the edge The future of the edge
The future of the edge
 
Migrating Target to Fastly - Eddie Roger at Fastly Altitude 2015
Migrating Target to Fastly - Eddie Roger at Fastly Altitude 2015Migrating Target to Fastly - Eddie Roger at Fastly Altitude 2015
Migrating Target to Fastly - Eddie Roger at Fastly Altitude 2015
 
Making ops life easier
Making ops life easierMaking ops life easier
Making ops life easier
 
From Zero to Capacity Planning
From Zero to Capacity PlanningFrom Zero to Capacity Planning
From Zero to Capacity Planning
 
Debugging Your CDN - Austin Spires at Fastly Altitude 2015
Debugging Your CDN - Austin Spires at Fastly Altitude 2015Debugging Your CDN - Austin Spires at Fastly Altitude 2015
Debugging Your CDN - Austin Spires at Fastly Altitude 2015
 

Similar to It Probably Works - QCon 2015

Apex code benchmarking
Apex code benchmarkingApex code benchmarking
Apex code benchmarking
purushottambhaigade
 
Introduction to Algorithms
Introduction to AlgorithmsIntroduction to Algorithms
Introduction to Algorithms
Venkatesh Iyer
 
Basics in algorithms and data structure
Basics in algorithms and data structure Basics in algorithms and data structure
Basics in algorithms and data structure Eman magdy
 
Conf orm - explain
Conf orm - explainConf orm - explain
Conf orm - explain
Louise Grandjonc
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)
Brian Brazil
 
Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"
NUS-ISS
 
And Then There Are Algorithms
And Then There Are AlgorithmsAnd Then There Are Algorithms
And Then There Are Algorithms
InfluxData
 
MongoDB World 2019: Event Horizon: Meet Albert Einstein As You Move To The Cloud
MongoDB World 2019: Event Horizon: Meet Albert Einstein As You Move To The CloudMongoDB World 2019: Event Horizon: Meet Albert Einstein As You Move To The Cloud
MongoDB World 2019: Event Horizon: Meet Albert Einstein As You Move To The Cloud
MongoDB
 
Lunch session: Quantum Computing
Lunch session: Quantum ComputingLunch session: Quantum Computing
Lunch session: Quantum Computing
Rolf Huisman
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
thelabdude
 
Visual diagnostics at scale
Visual diagnostics at scaleVisual diagnostics at scale
Visual diagnostics at scale
Rebecca Bilbro
 
Microsoft azure data fundamentals (dp 900) practice tests 2022
Microsoft azure data fundamentals (dp 900) practice tests 2022Microsoft azure data fundamentals (dp 900) practice tests 2022
Microsoft azure data fundamentals (dp 900) practice tests 2022
SkillCertProExams
 
Explainable AI
Explainable AIExplainable AI
Explainable AI
Wagston Staehler
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm Architecture
P. Taylor Goetz
 
Lecture 2 - Bit vs Qubits.pptx
Lecture 2 - Bit vs Qubits.pptxLecture 2 - Bit vs Qubits.pptx
Lecture 2 - Bit vs Qubits.pptx
NatKell
 
Approximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming ApplicationsApproximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming Applications
Debasish Ghosh
 
ScalaCheck
ScalaCheckScalaCheck
ScalaCheckBeScala
 
Akka with Scala
Akka with ScalaAkka with Scala
Akka with Scala
Oto Brglez
 
Types and Immutability: why you should care
Types and Immutability: why you should careTypes and Immutability: why you should care
Types and Immutability: why you should care
Jean Carlo Emer
 
Debugging Complex Systems - Erlang Factory SF 2015
Debugging Complex Systems - Erlang Factory SF 2015Debugging Complex Systems - Erlang Factory SF 2015
Debugging Complex Systems - Erlang Factory SF 2015
lpgauth
 

It Probably Works - QCon 2015

  • 2. Tyler McMullen CTO of Fastly @tbmcmullen
  • 4. What is a probabilistic algorithm?
  • 6. –Hal Abelson and Gerald J. Sussman, SICP “In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that cosmic radiation will cause the computer to make an error in carrying out a 'correct' algorithm. Considering an algorithm to be inadequate for the first reason but not for the second illustrates the difference between mathematics and engineering.”
  • 10. Don’t use them if someone’s life depends on it.
  • 11. Don’t use them if someone’s life depends on it.
  • 12. When should I use these?
  • 15. The Problem How many unique words are in a large corpus of text?
  • 16. The Problem How many different users visited a popular website in a day?
  • 17. The Problem How many unique IPs have connected to a server over the last hour?
  • 18. The Problem How many unique URLs have been requested through an HTTP proxy?
  • 19. The Problem How many unique URLs have been requested through an entire network of HTTP proxies?
  • 20.
    166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 20246
    103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 29870
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "HEAD /page/1" 200 20409
    150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 42999
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 25080
    150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 33617
    103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "HEAD /listing/567" 200 29618
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "HEAD /page/1" 200 30265
    166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 5683
    244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "HEAD /article/490" 200 47124
    103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "HEAD /listing/567" 200 48734
    103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 27392
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 15705
    150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 22587
    244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "HEAD /product/982" 200 30063
    244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "GET /page/1" 200 6041
    166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 25783
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 1099
    244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 31494
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 30389
    150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 10251
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 19384
    150.255.232.154 - - [25/Feb/2015:07:20:13 +0000] "HEAD /product/982" 200 24062
    244.210.202.222 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 19070
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /page/648" 200 45159
    191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "HEAD /page/648" 200 5576
    166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /page/648" 200 41869
    166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /listing/567" 200 42414
  • 21.
    def count_distinct(stream):
        seen = set()
        for item in stream:
            seen.add(item)
        return len(seen)
  • 24.
    def combined_cardinality(seen_sets):
        combined = set()
        for seen in seen_sets:
            combined |= seen
        return len(combined)
  • 25. The set grows linearly.
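As a rough illustration of the exact, set-based approach applied to access-log lines like the ones on slide 20, the sketch below extracts the client IP and request path from each line and counts distinct values. The regular expression and the names `LOG_LINE` and `parse` are assumptions made for this example; they are not from the talk.

```python
import re

# Assumed layout: ip - - [timestamp] "METHOD path" status bytes
LOG_LINE = re.compile(r'^(\S+) - - \[[^\]]+\] "(\S+) (\S+)" (\d{3}) (\d+)$')

def parse(line):
    """Return (ip, path) for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    ip, _method, path, _status, _size = match.groups()
    return ip, path

def count_distinct(stream):
    """Exact count: the set holds one entry per unique item seen."""
    seen = set()
    for item in stream:
        seen.add(item)
    return len(seen)

lines = [
    '166.208.249.236 - - [25/Feb/2015:07:20:13 +0000] "GET /product/982" 200 20246',
    '103.138.203.165 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 29870',
    '191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "HEAD /page/1" 200 20409',
    '191.141.247.227 - - [25/Feb/2015:07:20:13 +0000] "GET /article/490" 200 25080',
]
parsed = [p for p in map(parse, lines) if p is not None]
unique_ips = count_distinct(ip for ip, _ in parsed)
unique_paths = count_distinct(path for _, path in parsed)
print(unique_ips, unique_paths)  # 3 3
```

The answer is exact, but the set keeps every unique value in memory, which is the cost the rest of the talk tries to avoid.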
  • 30. 1 0 1 1 0 1 0 1 “example.com/user/page/1”
  • 31. 1 0 1 1 0 1 0 1
  • 32. 1 0 1 1 0 1 0 1 P(bit 0 is set) = 0.5
  • 33. 1 0 1 1 0 1 0 1 P(bit 0 is set & bit 1 is set) = 0.25
  • 34. 1 0 1 1 0 1 0 1 P(bit 0 is set & bit 1 is set & bit 2 is set) = 0.125
  • 35. 1 0 1 1 0 1 0 1 P(bit 0 is set) = 0.5; Expected trials = 2
  • 36. 1 0 1 1 0 1 0 1 P(bit 0 is set & bit 1 is set) = 0.25; Expected trials = 4
  • 37. 1 0 1 1 0 1 0 1 P(bit 0 is set & bit 1 is set & bit 2 is set) = 0.125; Expected trials = 8
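The "expected trials" figures on these slides follow from the geometric distribution, assuming each hash bit is set independently with probability 1/2 (an idealized model of a good hash function):

```latex
P(\text{bits } 0, \dots, k-1 \text{ all set}) = \left(\tfrac{1}{2}\right)^{k} = 2^{-k},
\qquad
E[\text{trials until first occurrence}] = \frac{1}{2^{-k}} = 2^{k}
```

For k = 1, 2, 3 this gives 2, 4 and 8 expected trials, matching the three slides.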
  • 38. We expect the maximum number of leading zeros we have seen + 1 to approximate log2(unique items).
  • 39. Improve the accuracy of the estimate by partitioning the input data.
  • 40.
    # `hash_fn`, `hash_len` and `Alpha` are not defined on this slide.
    import numpy as np

    class LogLog(object):
        def __init__(self, k):
            self.k = k
            self.m = 2 ** k
            # `np.int` is no longer available in current NumPy; use `int`
            self.M = np.zeros(self.m, dtype=int)
            self.alpha = Alpha[k]

        def insert(self, token):
            y = hash_fn(token)
            j = y >> (hash_len - self.k)
            remaining = y & ((1 << (hash_len - self.k)) - 1)
            # bit_length() is exact for large ints and handles remaining == 0
            first_set_bit = (hash_len - self.k) - remaining.bit_length() + 1
            self.M[j] = max(self.M[j], first_set_bit)

        def cardinality(self):
            return self.alpha * 2 ** np.mean(self.M)
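The slide's class depends on names it does not define, so here is a self-contained sketch of the same idea using only the standard library. Several choices are assumptions made for illustration: SHA-1 truncated to 64 bits stands in for `hash_fn`, and the asymptotic LogLog constant 0.39701 stands in for the `Alpha` table. The register count `m` is written out explicitly in the estimate; whether the talk's `Alpha[k]` already folds it in is not shown on the slide.

```python
import hashlib

HASH_LEN = 64          # bits taken from the hash (assumption)
ALPHA_INF = 0.39701    # asymptotic LogLog bias-correction constant

def hash64(token):
    """64-bit integer hash; SHA-1 is an arbitrary stand-in for hash_fn."""
    digest = hashlib.sha1(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

class LogLog:
    def __init__(self, k):
        self.k = k
        self.m = 1 << k              # number of registers
        self.M = [0] * self.m

    def insert(self, token):
        y = hash64(token)
        j = y >> (HASH_LEN - self.k)                      # top k bits pick a register
        remaining = y & ((1 << (HASH_LEN - self.k)) - 1)  # low bits give the rank
        # 1-based position of the first set bit, counted from the most
        # significant end of `remaining`.
        rank = (HASH_LEN - self.k) - remaining.bit_length() + 1
        self.M[j] = max(self.M[j], rank)

    def cardinality(self):
        mean_rank = sum(self.M) / self.m
        return ALPHA_INF * self.m * 2 ** mean_rank

sketch = LogLog(k=10)
for i in range(100000):
    sketch.insert("user-%d" % i)
    sketch.insert("user-%d" % i)   # duplicates leave the registers unchanged
estimate = sketch.cardinality()
print(round(estimate))
```

With 1,024 registers the LogLog standard error is roughly 1.30 / sqrt(1024), about 4%, so the printed estimate should land near 100,000 rather than exactly on it.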
  • 42. HyperLogLog
      • Adding an item: O(1)
      • Retrieving cardinality: O(1)
      • Space: O(log log n)
      • Error rate: 2%
  • 43. HyperLogLog For 100 million unique items, and an error rate of 2%, the size of the HyperLogLog is... 1,500 bytes
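The slide does not say which parameters produce the 1,500-byte figure. One plausible reading, sketched below, assumes 2^11 registers of 6 bits each and the usual HyperLogLog standard-error approximation of 1.04 / sqrt(m); both parameter values are assumptions for this example.

```python
import math

registers = 2 ** 11        # assumed register count; not stated on the slide
bits_per_register = 6      # assumed; enough to store ranks up to 63

size_bytes = registers * bits_per_register // 8
std_error = 1.04 / math.sqrt(registers)

print(size_bytes)                  # 1536
print(round(std_error * 100, 1))   # 2.3
```

Under these assumptions the sketch occupies about 1.5 KB at roughly 2% error, regardless of whether it has seen a thousand items or 100 million.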
  • 45. The Problem Reliably broadcast “purge” messages across the world as quickly as possible.
  • 58. Bimodal Multicast
      • Quickly broadcast message to all servers
      • Gossip to recover lost messages
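The two phases named on this slide can be illustrated with a toy simulation. This is a heavily simplified sketch, not Fastly's implementation or the published protocol: the function `simulate`, the loss model, and the "push whatever the peer is missing" gossip step are all assumptions chosen to keep the example short.

```python
import random

def simulate(num_servers=20, num_messages=50, loss_rate=0.2, seed=1, max_rounds=1000):
    rng = random.Random(seed)
    received = [set() for _ in range(num_servers)]

    # Phase 1: unreliable broadcast. Server 0 originates every message and
    # each delivery to another server is dropped with probability loss_rate.
    for msg in range(num_messages):
        received[0].add(msg)
        for server in range(1, num_servers):
            if rng.random() >= loss_rate:
                received[server].add(msg)

    # Phase 2: gossip. Each round, every server contacts one random peer,
    # and the peer recovers any messages it is missing from that server.
    everything = set(range(num_messages))
    rounds = 0
    while any(r != everything for r in received) and rounds < max_rounds:
        rounds += 1
        for server in range(num_servers):
            peer = rng.randrange(num_servers)
            if peer != server:
                received[peer] |= received[server] - received[peer]

    return rounds, all(r == everything for r in received)

rounds, converged = simulate()
print(rounds, converged)
```

The broadcast phase gets most messages to most servers immediately; the gossip phase then repairs the gaps in a small number of rounds, without any server needing a complete view of the cluster.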
  • 65. One Problem Computers have limited space
  • 73. End-to-End Latency [Figure: density plot and 95th percentile of purge latency by server location. Panels: New York, London, San Jose, Tokyo. x-axis: Latency (ms); y-axis: Density. Labeled values: 42ms, 74ms, 83ms, 133ms.]
  • 76. What was the point again?
  • 77. We can build things that are otherwise unrealistic
  • 78. We can build systems that are more reliable
  • 82. What even is this?
  • 86. Probabilistic Algorithms
      1. An iota of theory
      2. Where are they useful and where are they not?
      3. HyperLogLog
      4. Locality-sensitive Hashing
      5. Bimodal Multicast
  • 87. “An algorithm that uses randomness to improve its efficiency”
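  • 90. Las Vegas
    from random import randrange

    def find_las_vegas(haystack, needle):
        # Always returns a correct index, but the running time is random
        # (and unbounded if `needle` is absent).
        length = len(haystack)
        while True:
            index = randrange(length)
            if haystack[index] == needle:
                return index
  • 91. Monte Carlo
    from random import randrange

    def find_monte_carlo(haystack, needle, k):
        # Bounded running time (`k` probes), but may miss `needle`.
        length = len(haystack)
        for i in range(k):
            index = randrange(length)
            if haystack[index] == needle:
                return index
        return None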
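To make the Monte Carlo trade-off concrete, the sketch below computes how likely `k` random probes are to miss entirely. The helper `miss_probability` and the explicit `rng` parameter are additions for this example, not part of the slides.

```python
from random import Random

def find_monte_carlo(haystack, needle, k, rng):
    """Probe k random positions; return an index, or None if all probes miss."""
    for _ in range(k):
        index = rng.randrange(len(haystack))
        if haystack[index] == needle:
            return index
    return None

def miss_probability(n, c, k):
    """Chance that k independent probes all miss when c of n slots match."""
    return (1 - c / n) ** k

haystack = ["a", "b"] * 50       # 100 slots, half of them "a"
result = find_monte_carlo(haystack, "a", k=20, rng=Random(0))
p_miss = miss_probability(n=100, c=50, k=20)
print(p_miss)
```

With half the slots matching, 20 probes fail with probability 0.5 ** 20, a little under one in a million. The error bound is provable and shrinks exponentially in `k`, which is the sense in which these algorithms are "not guessing".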
  • 92. – Prabhakar Raghavan (co-author of Randomized Algorithms) “For many problems a randomized algorithm is the simplest, the fastest, or both.”
  • 93. Naive Solution For 100 million unique IPv4 addresses, the size of the hash is... >400 MB
  • 94. Slightly Less Naive Add each IP to a Bloom filter and keep a counter of the IPs that don’t collide.
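  • 95. Slightly Less Naive
    # Assumes a pybloom-style BloomFilter whose add() returns True when the
    # item was (probably) already present. `expected_size`, `log_file` and
    # `extract_ip` are not defined on this slide.
    ips_seen = BloomFilter(capacity=expected_size, error_rate=0.03)
    counter = 0

    for line in log_file:
        ip = extract_ip(line)
        if not ips_seen.add(ip):
            counter += 1

    print("Unique IPs:", counter)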
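For readers without a Bloom filter library at hand, here is a minimal self-contained sketch of the same counting idea. The class below is an illustration written for this example, not the library used in the talk; it follows the convention that `add()` returns True when the item already appeared to be present, and it derives its bit positions from SHA-256 by double hashing, which is an arbitrary choice.

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, capacity, error_rate):
        # Standard sizing formulas for a target false-positive rate.
        self.num_bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive num_hashes positions from two 64-bit values.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        """Set the item's bits; return True if every bit was already set."""
        already_present = True
        for pos in self._positions(item):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                already_present = False
                self.bits[byte] |= mask
        return already_present

ips = ["10.0.0.%d" % (i % 200) for i in range(1000)]   # 200 unique, each seen 5 times
seen = BloomFilter(capacity=1000, error_rate=0.03)
counter = sum(1 for ip in ips if not seen.add(ip))
print("Unique IPs:", counter)
```

A false positive makes a new IP look like a repeat, so this counter can only under-count, never over-count. That one-sided behavior is one reason the error rate is only "kind of" 3%.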
  • 96. Slightly Less Naive
      • Adding an IP: O(1)
      • Retrieving cardinality: O(1)
      • Space: O(n)
      • Error rate: 3% kind of
  • 97. Slightly Less Naive For 100 million unique IPv4 addresses, and an error rate of 3%, the size of the Bloom filter is... 87 MB
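Both size figures can be checked with a few lines of arithmetic. The sketch below assumes 4 bytes per raw IPv4 address for the naive case (ignoring any hash-table overhead) and the standard Bloom filter sizing formula m = -n ln(p) / (ln 2)^2 bits; the slides do not state how their numbers were derived.

```python
import math

n = 100_000_000     # unique IPv4 addresses
p = 0.03            # target false-positive rate

# Naive: at least 4 bytes per stored address.
naive_bytes = n * 4

# Bloom filter: optimal number of bits for n items at error rate p.
bloom_bits = -n * math.log(p) / math.log(2) ** 2
bloom_mib = bloom_bits / 8 / 2 ** 20

print(naive_bytes)        # 400000000
print(round(bloom_mib))   # 87
```

The Bloom filter needs about 7.3 bits per item at this error rate, so it is smaller than the raw set but still grows linearly with the number of unique items.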
  • 98.
    def insert(self, token):
        # Get hash of token
        y = hash_fn(token)

        # Extract `k` most significant bits of `y`
        j = y >> (hash_len - self.k)

        # Extract remaining bits of `y`
        remaining = y & ((1 << (hash_len - self.k)) - 1)

        # Find "first" set bit of `remaining` (1-based, counted from the
        # most significant end); bit_length() also handles `remaining == 0`
        first_set_bit = (hash_len - self.k) - remaining.bit_length() + 1

        # Update `M[j]` to max of `first_set_bit`
        # and existing value of `M[j]`
        self.M[j] = max(self.M[j], first_set_bit)
  • 99.
    def cardinality(self):
        # The mean of `M` estimates `log2(n)` with
        # an additive bias
        return self.alpha * 2 ** np.mean(self.M)
  • 100. The Problem Find documents that are similar to one specific document.
  • 101. The Problem Find images that are similar to one specific image.
  • 102. The Problem Find graphs that are correlated to one specific graph.
  • 104. The Problem “Find the n closest points in a d-dimensional space.”
  • 105. The Problem You have a bunch of things and you want to figure out which ones are similar.
  • 107. “There has been a lot of recent work on streaming algorithms, i.e. algorithms that produce an output by making one pass (or a few passes) over the data while using a limited amount of storage space and time. To cite a few examples, ...”
    { "there": 1, "has": 1, "been": 1, "a": 4, "lot": 1, "of": 2, "recent": 1, ... }
  • 108.
      • Cosine similarity
      • Jaccard similarity
      • Euclidean distance
      • etc etc etc
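The three measures named on this slide can be sketched directly on word-count vectors like the one on slide 107. The function names and the whitespace tokenizer are assumptions for this example; Jaccard similarity is computed here on the sets of distinct words, which is one common definition among several.

```python
import math
from collections import Counter

def word_counts(text):
    """Bag-of-words vector: word -> number of occurrences."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def jaccard_similarity(a, b):
    union = a.keys() | b.keys()
    return len(a.keys() & b.keys()) / len(union) if union else 0.0

def euclidean_distance(a, b):
    # Counter returns 0 for words missing from one of the documents.
    return math.sqrt(sum((a[w] - b[w]) ** 2 for w in a.keys() | b.keys()))

x = word_counts("a lot of recent work")
y = word_counts("a lot of old work")
print(cosine_similarity(x, y), jaccard_similarity(x, y), euclidean_distance(x, y))
```

The two example documents share four of six distinct words, giving a cosine similarity of about 0.8, a Jaccard similarity of about 0.67 and a Euclidean distance of sqrt(2). Comparing every pair of documents this way is quadratic in the number of documents.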
  • 116. Locality-Sensitive Hashing for Finding Nearest Neighbors - Slaney and Casey
  • 118. { "there": 1, "has": 1, "been": 1, "a": 4, "lot": 1, "of": 2, "recent": 1, ... }
    LSH
    0 1 1 1 0 1 0 0 0 0 1 0 1 0 1 0 ...
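The slide does not say which hash family turns the word counts into a bit string. As one possible illustration, the sketch below uses a SimHash-style signature, a locality-sensitive scheme for cosine similarity in which each word votes on every output bit. The function names, the 16-bit width and the use of SHA-256 to derive per-word signs are all assumptions for this example.

```python
import hashlib
from collections import Counter

def simhash(counts, num_bits=16):
    """SimHash-style signature of a word-count vector, as a list of 0/1 bits."""
    totals = [0] * num_bits
    for word, count in counts.items():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        word_bits = int.from_bytes(digest[:8], "big")
        for i in range(num_bits):
            # Each word votes +count or -count on every output bit.
            sign = 1 if (word_bits >> i) & 1 else -1
            totals[i] += sign * count
    return [1 if total > 0 else 0 for total in totals]

def hamming(a, b):
    """Number of positions where two signatures differ."""
    return sum(x != y for x, y in zip(a, b))

text = "there has been a lot of recent work on streaming algorithms"
doc = Counter(text.split())
same = Counter(text.split())
sig = simhash(doc)
print(sig, hamming(sig, simhash(same)))
```

Identical documents always produce identical signatures, and documents with similar word counts tend to differ in only a few bits, so candidate neighbors can be found by comparing or bucketing short bit strings instead of full vectors.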