PROBABILISTIC DATA
STRUCTURES IN REAL LIFE
Valentin Bazarevsky
WHAT ARE THEY?
Bloom Filter
LogLog Family
MinHash
BUSINESS CASE:
ESTIMATE YOUR AUDIENCE
SEGMENT BUILDER
15 TB of transactional data
4h SLA
POSSIBLE SOLUTIONS
Brute force (15 TB of transactional data)
Sampling (1% of users => 1.2 MB / b.o.)
Magic tool (?!)
Estimator
HyperLogLog can estimate the number of unique elements in sets of
more than 1 000 000 000 items with about 1% error, using only 4 KB of memory
50 000 000 basic operations
OOPS…
Supports only Unions
But we need Intersection, Subtraction, and Not operators
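Why unions come for free: a HyperLogLog sketch is an array of registers, each holding the largest rank seen so far, so two sketches built with the same hash function and register count can be merged with an element-wise maximum. No comparable register-level operation exists for intersection. A minimal sketch, where `merge_registers` is a hypothetical helper name and the register values are made up for illustration:

```python
from typing import List

def merge_registers(a: List[int], b: List[int]) -> List[int]:
    """Return the registers of the union of two HyperLogLog sketches.

    Assumption: both sketches use the same hash function and register count.
    """
    if len(a) != len(b):
        raise ValueError("sketches must have the same number of registers")
    # Each register keeps the maximum rank observed, so the union is the
    # element-wise maximum of the two register arrays.
    return [max(x, y) for x, y in zip(a, b)]

merged = merge_registers([0, 3, 1, 2], [2, 1, 1, 5])
print(merged)  # [2, 3, 1, 5]
```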
HYPERLOGLOG INTUITION
00101010101010001111010101101 => a[2] = 0
10010101010100101010101001011 => a[9] = 1
00000101010100101010101110101 => a[0] = 1
01010101010100100101010101010 => a[5] = 1
01010000000000000000000000010 => a[5] = 23
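The example above can be reproduced with a few lines of Python. This is only a sketch of the register update, not a full estimator. It assumes, as the bit strings suggest, that the first 4 bits select the register and that the register keeps the largest count of leading zeros seen in the remaining bits. The name `update` is hypothetical, and the last hash is built as the index bits followed by 23 zeros and `10`.

```python
from typing import Dict

INDEX_BITS = 4  # assumption: the first 4 bits pick the register

def update(registers: Dict[int, int], hash_bits: str) -> None:
    """Update one register from a hash given as a string of '0'/'1'."""
    index = int(hash_bits[:INDEX_BITS], 2)
    rest = hash_bits[INDEX_BITS:]
    # Count leading zeros in the bits after the index.
    leading_zeros = len(rest) - len(rest.lstrip("0"))
    # Keep the maximum seen so far for this register.
    registers[index] = max(registers.get(index, 0), leading_zeros)

hashes = [
    "00101010101010001111010101101",
    "10010101010100101010101001011",
    "00000101010100101010101110101",
    "01010101010100100101010101010",
    "0101" + "0" * 23 + "10",
]

a: Dict[int, int] = {}
for h in hashes:
    update(a, h)

print(sorted(a.items()))  # [(0, 1), (2, 0), (5, 23), (9, 1)]
```

A long run of leading zeros is rare, so seeing one in a register hints that many distinct values have been hashed into it.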
INCLUSION-EXCLUSION PRINCIPLE
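For two sets the principle reads |A ∩ B| = |A| + |B| - |A ∪ B|, so an intersection size can be derived from three quantities a union-only estimator can provide. The sketch below uses exact Python sets as a stand-in for HyperLogLog estimates (an assumption made for illustration; `intersection_size` is a hypothetical name). With real sketches the errors of the three estimates add up, so small intersections can be noisy.

```python
def intersection_size(card_a: float, card_b: float, card_union: float) -> float:
    """Inclusion-exclusion for two sets: |A ∩ B| = |A| + |B| - |A ∪ B|."""
    return card_a + card_b - card_union

# Exact sets stand in for the cardinality estimates.
a = set(range(0, 1000))
b = set(range(600, 1500))

estimate = intersection_size(len(a), len(b), len(a | b))
print(estimate)  # 400
```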
MINHASH
Store only the x (8192) smallest hashes of each set
Jaccard Distance
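One common way to realize this idea is a bottom-k sketch: keep the k smallest hash values of each set, then estimate the Jaccard similarity from the k smallest values of the combined sketches (the Jaccard distance is one minus that similarity). The code below is a minimal sketch under assumptions: SHA-1 truncated to 64 bits as the hash function, and the hypothetical names `hash64`, `sketch`, and `jaccard`. It is not necessarily the variant used in the talk.

```python
import hashlib
from typing import Iterable, List

K = 8192  # sketch size mentioned on the slide

def hash64(item: str) -> int:
    """64-bit hash of a string (assumption: truncated SHA-1)."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def sketch(items: Iterable[str], k: int = K) -> List[int]:
    """Keep only the k smallest distinct hash values of the set."""
    return sorted({hash64(x) for x in items})[:k]

def jaccard(sa: List[int], sb: List[int], k: int = K) -> float:
    """Estimate |A ∩ B| / |A ∪ B| from two bottom-k sketches."""
    union_k = sorted(set(sa) | set(sb))[:k]  # k smallest hashes of A ∪ B
    if not union_k:
        return 0.0
    both = set(sa) & set(sb)
    shared = sum(1 for h in union_k if h in both)
    return shared / len(union_k)

users_a = [f"user{i}" for i in range(1000)]
users_b = [f"user{i}" for i in range(500, 1500)]

similarity = jaccard(sketch(users_a), sketch(users_b))
print(similarity)      # close to 1/3: 500 shared users out of 1500
print(1 - similarity)  # Jaccard distance
```

Multiplying the estimated similarity by an estimate of |A ∪ B| gives an estimate of |A ∩ B|, which is where a union-only cardinality estimator and MinHash complement each other.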
UNION OF INTERSECTIONS
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
A - B - C = A - (B ∪ C)
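Both rewrites are standard set identities: the distributive law A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), and chained subtraction A - B - C = A - (B ∪ C). A quick randomized check with exact Python sets, as an illustration only (`check_identities` is a hypothetical name):

```python
import random

def check_identities(trials: int = 1000, universe_size: int = 20) -> bool:
    """Verify both identities on random subsets of a small universe."""
    rng = random.Random(0)  # fixed seed for reproducibility
    universe = range(universe_size)
    for _ in range(trials):
        a, b, c = (
            set(rng.sample(universe, rng.randint(0, universe_size)))
            for _ in range(3)
        )
        # Union distributes over intersection.
        if a | (b & c) != (a | b) & (a | c):
            return False
        # Subtracting B, then C, equals subtracting their union.
        if a - b - c != a - (b | c):
            return False
    return True

print(check_identities())  # True
```

Rewriting an expression this way pushes unions to the inside, where a union-only estimator can handle them directly.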
NOT OPERATOR
Subtraction
I WANT EVERYONE EXCEPT…
A and not B
Not A and Not B
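"Not A and Not B" follows from De Morgan's law: it equals not (A ∪ B), so its size is |Everything| - |A ∪ B|, which needs only a union. A minimal sketch with exact sets standing in for estimates (assumption; the function name and the sample sets are hypothetical):

```python
def not_a_and_not_b(card_everything: int, card_a_or_b: int) -> int:
    """|not A ∩ not B| = |Everything| - |A ∪ B| (De Morgan)."""
    return card_everything - card_a_or_b

everything = set(range(100))
a = set(range(0, 40))
b = set(range(30, 70))

size = not_a_and_not_b(len(everything), len(a | b))
print(size)  # 30
```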
CORNER CASES
|(A ∪ not(B)) ∩ C| => |A ∩ C|
|A ∪ not(B)| = |Everything| - |B| + |A ∩ B|
|A ∩ not(B)| => |A| - |A ∩ B|
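The last two rewrites are exact identities and can be checked with plain Python sets. This is an illustration only: exact sets stand in for the estimated cardinalities, and the function names and sample sets are hypothetical.

```python
def card_a_or_not_b(card_everything: int, card_b: int, card_a_and_b: int) -> int:
    """|A ∪ not(B)| = |Everything| - |B| + |A ∩ B|."""
    return card_everything - card_b + card_a_and_b

def card_a_and_not_b(card_a: int, card_a_and_b: int) -> int:
    """|A ∩ not(B)| = |A| - |A ∩ B|."""
    return card_a - card_a_and_b

everything = set(range(100))
a = set(range(0, 40))
b = set(range(30, 70))
not_b = everything - b

print(card_a_or_not_b(len(everything), len(b), len(a & b)))  # 70
print(len(a | not_b))                                        # 70
print(card_a_and_not_b(len(a), len(a & b)))                  # 30
print(len(a & not_b))                                        # 30
```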
ARCHITECTURE
ERROR RATE
Median = 5%
75th percentile = 8%