fooTalks: Increasing your knowledge through sharing. Tell what you know, hear what others know. Expected to happen periodically, every week. Expects only a limited audience. Expects a contributing audience. As interest rises, experts can be brought in. Needs volunteer speakers.
FooTalks #1 Agenda: Introduction, Explanation, Demo, Algorithms, Q&A.
Agenda Bloom Filters
Introduction: Bloom filters are compact data structures for the probabilistic representation of a set in order to support membership queries (i.e. queries that ask: "Is element X in set Y?"). This compact representation is the payoff for allowing a small rate of false positives in membership queries; that is, queries might incorrectly recognize an element as a member of the set.
In a simple way: a data structure that represents the presence of elements in a set. With a Bloom filter you can check whether an element is present in the set or not. Allows false positives: it may say yes even when the answer is no. Never allows false negatives: it will never say no when the answer is yes. E.g. for {a, b, c, d}, a check for 'z' might give yes, but a check for 'a' will never give no.
What else do you need? Hash function(s): given an input, a hash function gives a numerical output, e.g. f("hello") = 9, f("fooTalks") = 23.
How it's represented: a bit vector, here with six bits at indices 0 1 2 3 4 5.
And how does it work? f("hello") = 5: map this to our bit vector, i.e. set the bit at index 5. Bit vector indices: 0 1 2 3 4 5.
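If the index exceeds the vector size: use mod! What else did you learn it for? Take the hash value modulo the number of bits. With six bits (indices 0 1 2 3 4 5), a hash value of 9 maps to 9 % 6 = 3.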
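The mapping from a hash value to a bit index can be sketched in Python as follows. This is only an illustration: the name `bit_index` and the choice of SHA-256 as the hash are assumptions, not something taken from the talk or its demo code.

```python
import hashlib


def bit_index(item: str, m: int) -> int:
    """Map a string to an index in a bit vector of m bits.

    Assumption: SHA-256 stands in for the hash function f from the slides.
    """
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")  # a large non-negative integer
    return hash_value % m  # mod keeps the index inside 0 .. m-1


m = 6  # six bits, indices 0 to 5, as in the slide's bit vector
for word in ("hello", "fooTalks"):
    print(word, "->", bit_index(word, m))
```

Whatever the hash returns, the mod step guarantees an index that fits the vector.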
And what else? Multiple inputs can hash to the same index: f("hello") = 5 and f("foo") = 5. This leads to false positives.
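A tiny Python sketch of that collision, using a lookup table as a pretend hash function so the numbers match the slide. The values for "hello" and "foo" come from the slide; the value for "bar" and the helper names `add` and `might_contain` are made up for illustration.

```python
# Pretend hash values: "hello" and "foo" are from the slide, "bar" is assumed.
PRETEND_HASH = {"hello": 5, "foo": 5, "bar": 2}

m = 6           # six bits, indices 0 to 5
bits = [0] * m  # the bit vector, all zeros to start with


def add(word: str) -> None:
    """Set the bit that the word hashes to."""
    bits[PRETEND_HASH[word] % m] = 1


def might_contain(word: str) -> bool:
    """Return True if the word's bit is set (it may or may not have been added)."""
    return bits[PRETEND_HASH[word] % m] == 1


add("hello")                   # sets index 5
print(might_contain("hello"))  # True: it really was added
print(might_contain("foo"))    # True: never added, but collides at index 5
print(might_contain("bar"))    # False: index 2 was never set
```

"foo" was never added, yet the filter answers yes: that is the false positive.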
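And you now know why there are no false negatives: adding an element only ever sets bits to 1 and nothing clears them, so every bit of an added element is still 1 when you test for it. If not, get out of here...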
Usage: a simple spell checker. Or I could complicate it: also used in networks, bioinformatics, blah, blah, blah...
Enough with the talking... Demo. Code available at: github.com/jesly.varghese
Algorithm: Setting up the BF
Procedure BloomFilter(set A, hash_functions, integer m) returns filter
    filter = allocate m bits initialized to 0
    foreach a_i in A:
        foreach hash function h_j:
            filter[h_j(a_i)] = 1
        end foreach
    end foreach
    return filter
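The setup procedure translates almost line for line into Python. The sketch below is not the demo code from the talk: the helper `make_hash`, the salts, and the size m = 32 are all assumptions chosen to get several hash functions out of SHA-256.

```python
import hashlib
from typing import Callable, Iterable, List


def make_hash(salt: str, m: int) -> Callable[[str], int]:
    """Build one hash function that returns an index in 0 .. m-1.

    Assumption: salting SHA-256 differently gives us distinct hash functions.
    """
    def h(item: str) -> int:
        digest = hashlib.sha256((salt + item).encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % m
    return h


def bloom_filter(items: Iterable[str],
                 hash_functions: List[Callable[[str], int]],
                 m: int) -> List[int]:
    """Return a list of m bits with the bits for every item set to 1."""
    bits = [0] * m                # allocate m bits initialized to 0
    for a in items:               # foreach a_i in A
        for h in hash_functions:  # foreach hash function h_j
            bits[h(a)] = 1
    return bits


m = 32  # assumed size, picked only for the example
hash_functions = [make_hash(salt, m) for salt in ("h1:", "h2:", "h3:")]
bits = bloom_filter({"a", "b", "c", "d"}, hash_functions, m)
print("".join(str(b) for b in bits))
```

A list of integers keeps the sketch readable; a real filter would pack the bits into bytes to get the compactness the introduction talks about.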
Algorithm: Membership Test
Procedure MembershipTest(elm, filter, hash_functions) returns yes/no
    foreach hash function h_j:
        if filter[h_j(elm)] != 1 return No
    end foreach
    return Yes
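A matching Python sketch of the membership test. The two hash functions `h1` and `h2` (SHA-256 and BLAKE2b from the standard library) and the size M = 64 are assumptions for illustration; the filter is filled inline so the example runs on its own.

```python
import hashlib
from typing import Callable, List

M = 64  # assumed number of bits


def h1(item: str) -> int:
    """First hash function (assumed): SHA-256 reduced mod M."""
    return int.from_bytes(hashlib.sha256(item.encode("utf-8")).digest(), "big") % M


def h2(item: str) -> int:
    """Second hash function (assumed): BLAKE2b reduced mod M."""
    return int.from_bytes(hashlib.blake2b(item.encode("utf-8")).digest(), "big") % M


def membership_test(elm: str, bits: List[int],
                    hash_functions: List[Callable[[str], int]]) -> bool:
    """Return False as soon as one bit is unset, otherwise True."""
    for h in hash_functions:
        if bits[h(elm)] != 1:
            return False  # definitely not in the set
    return True           # in the set, or a false positive


hash_functions = [h1, h2]
bits = [0] * M
for word in ("a", "b", "c", "d"):  # fill the filter with the set {a, b, c, d}
    for h in hash_functions:
        bits[h(word)] = 1

for word in ("a", "b", "c", "d"):
    print(word, membership_test(word, bits, hash_functions))  # always True
# 'z' was never added: usually False, but True would be a false positive.
print("z", membership_test("z", bits, hash_functions))
```

The early return is what makes a "no" trustworthy: a single unset bit proves the element was never added.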
Q&A: I hate this part. Well, you can ask, I can try answering...
fooTalks ends. May the force be with you.

Footalks#1 Bloom Filters
