SlideShare a Scribd company logo
Optimal Binary Search
Tree
Le Phong Vu
About me
▶ Le Phong Vu
▶ Member of Grokking
▶ Data structure and caching system
▶ Email: lephongvu90@gmail.com
2
Agenda
▶ Problem: Build a dictionary online with fast lookup
▶ Approach to apply OBST in dictionary application
▶ Use Dynamic programming to find OBST
▶ Optimize solution to reduce time create OBST
▶ Application of OBST
3
Problem: Build a dictionary
▶ Given an English world, you want to search its meaning quickly
4
Build a dictionary with BST
apple
hatch
mango
pizzajump
Balanced binary search tree
Worst-case O(log n) time per query
Total cost: 2000100
Word Total
search
Depth of
Node
Cost
apple 1.000.000 2 1000000 * 2
hatch 100 1 100 * 1
jump,
mango,
pizza
0 * 0
5
Re-structure BST
apple
hatch
…
Re-structure BST, move “apple” node
to root
Total cost: 1000200
Word Total search Depth of
Node
Cost
apple 1.000.000 1 1.000.000 * 1
hatch 100 2 100 * 2
jump,
mango,
pizza
0 * 0
OBTIMAL BINARY
SEARCH TREE
6
Approach 7
Build a
balanced
BST
Collect
search
information
Calculate
probability
search
Calculate a
Optimal BST
Change
current BST
by new
Optimal BST
Key and probability
▶  
word key probability
apple
hatch
jump
mango
8
Dummy Key
Meaning Symbol
d0
d1
d2
di
d0 d1
d2 d3 d5d4
apple
hatch
jump
mango
pizza< apple (apple, hatch)
9
Example 10
word key total search probability
< apple 5
apple 10
(apple, hatch) … …
hatch … …
jump … …
mango … …
Total search = 100
Search time
▶  
11
Optimal Binary Search Tree
▶ Given the set probabilities p and q, can we construct a
binary search tree whose expected search time is minimized?
▶ This tree is called an Optimal Binary Search Tree
12
Dynamic Programming
▶ Solves problems by combining the solutions to sub-problems.
13
Figure 1: Dynamic Programming
https://itnext.io/dynamic-programming-vs-divid
e-and-conquer-2fea680becbe
Step 1: The structure of OBST
▶ If an OBST has a sub-tree T, then this sub-tree T must be optimal BST.
14
Optimal
BST
Optimal
BST
Optimal
BST
Step 2: Deciding root of OBST
▶  
15
Step 2: Deciding Root of OBST
▶  
16
kr
Contribute
e [i, r-1]
Contribute
e [i, r-1]
Contribute
W[i,j]
Step 3: Bottom-up approach
▶  
17
 
18
Optimize solution
▶  
19
Application
▶ Huffman Tree
▶ Optimal routing lookup
20
Reference
▶ Le Phong Vu, “Dùng kỹ thuật quy hoạch động giải bài toán tối ưu
cây nhị phân tìm kiếm”
https://engineering.grokking.org/optimal-binary-search-tree/
▶ D. Knuth, “Optimum binary search trees”, Acta Informatica
www.inrg.csie.ntu.edu.tw/algorithm2014/presentation/Knuth71.pdf
▶ Thomas H. Cormen, “Introduction to algorithm”, MIT Press, 3rd
Edition
▶ D. Knuth, “The art of computer programming”, volume 3
▶ Kurt Mehlhorn, “Nearly Optimal Binary Search Tree”
https://people.mpi-inf.mpg.de/~mehlhorn/ftp/mehlhorn3.pdf
21
Question
22
Example 23
An example from book:
Introduction to Algorithm, 3rd
edition
Example 24
Pseudocode 25

More Related Content

What's hot

What's hot (20)

Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics
 
12. Heaps - Data Structures using C++ by Varsha Patil
12. Heaps - Data Structures using C++ by Varsha Patil12. Heaps - Data Structures using C++ by Varsha Patil
12. Heaps - Data Structures using C++ by Varsha Patil
 
0 1 knapsack using branch and bound
0 1 knapsack using branch and bound0 1 knapsack using branch and bound
0 1 knapsack using branch and bound
 
Divide&Conquer & Dynamic Programming
Divide&Conquer & Dynamic ProgrammingDivide&Conquer & Dynamic Programming
Divide&Conquer & Dynamic Programming
 
NLTK
NLTKNLTK
NLTK
 
Algorithms - "Chapter 2 getting started"
Algorithms - "Chapter 2 getting started"Algorithms - "Chapter 2 getting started"
Algorithms - "Chapter 2 getting started"
 
Artificial Intelligence Lab File
Artificial Intelligence Lab FileArtificial Intelligence Lab File
Artificial Intelligence Lab File
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
 
Ai lab manual
Ai lab manualAi lab manual
Ai lab manual
 
Longest Common Subsequence
Longest Common SubsequenceLongest Common Subsequence
Longest Common Subsequence
 
Naive String Matching Algorithm | Computer Science
Naive String Matching Algorithm | Computer ScienceNaive String Matching Algorithm | Computer Science
Naive String Matching Algorithm | Computer Science
 
Abstractive Text Summarization
Abstractive Text SummarizationAbstractive Text Summarization
Abstractive Text Summarization
 
Recursion
RecursionRecursion
Recursion
 
Natural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionNatural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - Introduction
 
Dynamic Programming - Matrix Chain Multiplication
Dynamic Programming - Matrix Chain MultiplicationDynamic Programming - Matrix Chain Multiplication
Dynamic Programming - Matrix Chain Multiplication
 
Aoa amortized analysis
Aoa amortized analysisAoa amortized analysis
Aoa amortized analysis
 
Learn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesLearn 90% of Python in 90 Minutes
Learn 90% of Python in 90 Minutes
 
Greedy algorithms
Greedy algorithmsGreedy algorithms
Greedy algorithms
 
Chapter 6 intermediate code generation
Chapter 6   intermediate code generationChapter 6   intermediate code generation
Chapter 6 intermediate code generation
 
Intermediate code generation
Intermediate code generationIntermediate code generation
Intermediate code generation
 

Similar to Grokking TechTalk #27: Optimal Binary Search Tree

The Language of Compression - Leif Walsh
The Language of Compression - Leif WalshThe Language of Compression - Leif Walsh
The Language of Compression - Leif Walsh
Two Sigma
 
Chapter 3 linear programming
Chapter 3   linear programmingChapter 3   linear programming
Chapter 3 linear programming
sarkissk
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
MLconf
 

Similar to Grokking TechTalk #27: Optimal Binary Search Tree (20)

The Language of Compression
The Language of CompressionThe Language of Compression
The Language of Compression
 
The Language of Compression - Leif Walsh
The Language of Compression - Leif WalshThe Language of Compression - Leif Walsh
The Language of Compression - Leif Walsh
 
0-introduction.pdf
0-introduction.pdf0-introduction.pdf
0-introduction.pdf
 
Unleashing the Power of Python Using the New Minitab/Python Integration Modul...
Unleashing the Power of Python Using the New Minitab/Python Integration Modul...Unleashing the Power of Python Using the New Minitab/Python Integration Modul...
Unleashing the Power of Python Using the New Minitab/Python Integration Modul...
 
Chapter 3 linear programming
Chapter 3   linear programmingChapter 3   linear programming
Chapter 3 linear programming
 
XGBoost @ Fyber
XGBoost @ FyberXGBoost @ Fyber
XGBoost @ Fyber
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
 
Applying Linear Optimization Using GLPK
Applying Linear Optimization Using GLPKApplying Linear Optimization Using GLPK
Applying Linear Optimization Using GLPK
 
Stuart Mitchell - Pulp Optimisation
Stuart Mitchell - Pulp OptimisationStuart Mitchell - Pulp Optimisation
Stuart Mitchell - Pulp Optimisation
 
Adtech scala-performance-tuning-150323223738-conversion-gate01
Adtech scala-performance-tuning-150323223738-conversion-gate01Adtech scala-performance-tuning-150323223738-conversion-gate01
Adtech scala-performance-tuning-150323223738-conversion-gate01
 
Adtech x Scala x Performance tuning
Adtech x Scala x Performance tuningAdtech x Scala x Performance tuning
Adtech x Scala x Performance tuning
 
Precomputing recommendations with Apache Beam
Precomputing recommendations with Apache BeamPrecomputing recommendations with Apache Beam
Precomputing recommendations with Apache Beam
 
SoC HPC: Design, Optimization, and Application to Algorithmic Trading
SoC HPC: Design, Optimization, and Application to Algorithmic TradingSoC HPC: Design, Optimization, and Application to Algorithmic Trading
SoC HPC: Design, Optimization, and Application to Algorithmic Trading
 
Parallel computing in Python: Current state and recent advances
Parallel computing in Python: Current state and recent advancesParallel computing in Python: Current state and recent advances
Parallel computing in Python: Current state and recent advances
 
Cloud Backup Overview
Cloud Backup Overview Cloud Backup Overview
Cloud Backup Overview
 
Tutorial "Linked Data Query Processing" Part 5 "Query Planning and Optimizati...
Tutorial "Linked Data Query Processing" Part 5 "Query Planning and Optimizati...Tutorial "Linked Data Query Processing" Part 5 "Query Planning and Optimizati...
Tutorial "Linked Data Query Processing" Part 5 "Query Planning and Optimizati...
 
Query optimization
Query optimizationQuery optimization
Query optimization
 
Parallel computing in Python: Current state and recent advances
Parallel computing in Python: Current state and recent advancesParallel computing in Python: Current state and recent advances
Parallel computing in Python: Current state and recent advances
 
Cornerstones of Cost Accounting 1st Edition Hansen Test Bank
Cornerstones of Cost Accounting 1st Edition Hansen Test BankCornerstones of Cost Accounting 1st Edition Hansen Test Bank
Cornerstones of Cost Accounting 1st Edition Hansen Test Bank
 
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionRecent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
 

More from Grokking VN

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking VN
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles Thinking
Grokking VN
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystified
Grokking VN
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking VN
 

More from Grokking VN (20)

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles Thinking
 
Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystified
 
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applications
 
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 Grokking Techtalk #39: How to build an event driven architecture with Kafka ... Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compiler
 
Grokking Techtalk #37: Data intensive problem
 Grokking Techtalk #37: Data intensive problem Grokking Techtalk #37: Data intensive problem
Grokking Techtalk #37: Data intensive problem
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
Grokking Techtalk #37: Software design and refactoring
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellchecking
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 
Grokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKI
 
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
 
SOLID & Design Patterns
SOLID & Design PatternsSOLID & Design Patterns
SOLID & Design Patterns
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous Communications
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the Magic
 

Recently uploaded

Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
UXDXConf
 

Recently uploaded (20)

Motion for AI: Creating Empathy in Technology
Motion for AI: Creating Empathy in TechnologyMotion for AI: Creating Empathy in Technology
Motion for AI: Creating Empathy in Technology
 
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeFree and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
Server-Driven User Interface (SDUI) at Priceline
Server-Driven User Interface (SDUI) at PricelineServer-Driven User Interface (SDUI) at Priceline
Server-Driven User Interface (SDUI) at Priceline
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System Strategy
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT Professionals
 
Transforming The New York Times: Empowering Evolution through UX
Transforming The New York Times: Empowering Evolution through UXTransforming The New York Times: Empowering Evolution through UX
Transforming The New York Times: Empowering Evolution through UX
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
Enterprise Security Monitoring, And Log Management.
Enterprise Security Monitoring, And Log Management.Enterprise Security Monitoring, And Log Management.
Enterprise Security Monitoring, And Log Management.
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 

Grokking TechTalk #27: Optimal Binary Search Tree