PHP DataStructures – Beyond SPL
A dreamscape made from random noise. Illustration: Google
DataStructures
A data structure is a particular way of organizing data in a
computer so that it can be used efficiently.
Different kinds of data structures are suited to different kinds
of applications, and some are highly specialized to specific
tasks.
DataStructures in PHP
• Some basic DataStructures available in PHP’s SPL
  • Stack
  • Queue
  • Heap
  • Doubly-Linked List
  • Fixed Array
  • SPL Object Storage
• SPL is the Standard PHP Library
  • (Yet another recursive acronym)
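As a quick reminder of what the SPL structures listed above look like in use, here is a minimal sketch. The variable names are illustrative only; the classes are the standard SPL ones.

```php
<?php
// Stack: last in, first out.
$stack = new SplStack();
$stack->push('a');
$stack->push('b');
$top = $stack->pop();           // 'b'

// Queue: first in, first out.
$queue = new SplQueue();
$queue->enqueue('a');
$queue->enqueue('b');
$first = $queue->dequeue();     // 'a'

// Heap: SplMinHeap always extracts the smallest value first.
$heap = new SplMinHeap();
foreach ([5, 1, 3] as $number) {
    $heap->insert($number);
}
$smallest = $heap->extract();   // 1

// Doubly-linked list: values can be added at either end.
$list = new SplDoublyLinkedList();
$list->push('x');
$list->push('y');
$list->unshift('w');            // list is now w, x, y

// Fixed array: integer-indexed with a fixed size.
$fixed = new SplFixedArray(3);
$fixed[0] = 'zero';

// Object storage: a map keyed by object identity.
$storage = new SplObjectStorage();
$object = new stdClass();
$storage[$object] = 'metadata';
```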
DataStructures
• Some additional DataStructures that don’t exist in core PHP
• Tries
• QuadTrees
Tries
Tries
• A Tree structure comprising a hierarchy of “indexed” nodes
• Each node can contain:
  • A series of pointers (keys) to the next node in the hierarchy
  • A bucket for data values
    • This allows for multiple values with the same key
• There are three basic types of Tries:
  • Basic Tries
  • Radix Tries
  • Suffix Tries
Tries – Purpose
• Fast lookup with a partial key
• Example implementation
https://github.com/MarkBaker/Tries
Tries – Uses
• Replacement for PHP Arrays (Hashmaps)
  • No key collisions
  • Duplicate Keys supported
  • No Hashing function required
• Partial Key Lookups
  • Predictive Text
  • Autocomplete
  • Spell-Checking
  • Hyphenation
Tries – Methods
• add($key, $value = null)
Adds new data to a Trie
• search($prefix)
Find data in a Trie
• delete($key)
• isNode($key)
• isMember($key)
Tries – Basic Trie
• Node pointers comprise a single character or byte
Tries – Basic Trie
$trie = new Trie();
$trie->add('cat', 'cat data');
[Diagram: a single branch C → A → T]
Tries – Basic Trie
$trie = new Trie();
$trie->add('cat', 'cat data');
$trie->add('car', 'car data');
[Diagram: C → A, which branches to T and R]
Tries – Basic Trie
$trie = new Trie();
$trie->add('cat', 'cat data');
$trie->add('car', 'car data');
$trie->add('cart', 'cart data');
[Diagram: C → A, which branches to T and R; R has a further child T]
Tries – Basic Trie
$trie = new Trie();
$trie->add('cat', 'cat data');
$trie->add('car', 'car data');
$trie->add('cart', 'cart data');
$trie->search('car');
[Diagram: the same Trie with the search path C → A → R highlighted]
Tries – Basic Trie
• The key to a data node is inherent in the path to that node,
so it is not necessary to store the key
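The walk-through above can be sketched as a small class. This is not the code from the linked repository: `SimpleTrie` and `TrieNode` are hypothetical names, and the return shape of `search()` (an array of key => values) is an assumption made for illustration. Note that the key is never stored in a node; it is rebuilt from the path while collecting results.

```php
<?php
final class TrieNode
{
    /** @var array<string, TrieNode> child nodes, keyed by a single byte */
    public array $children = [];
    /** @var array bucket of values whose key ends at this node */
    public array $values = [];
}

final class SimpleTrie
{
    private TrieNode $root;

    public function __construct()
    {
        $this->root = new TrieNode();
    }

    public function add(string $key, $value = null): void
    {
        $node = $this->root;
        for ($i = 0, $length = strlen($key); $i < $length; $i++) {
            $byte = $key[$i];
            if (!isset($node->children[$byte])) {
                $node->children[$byte] = new TrieNode();
            }
            $node = $node->children[$byte];
        }
        $node->values[] = $value; // the bucket allows duplicate keys
    }

    /** Returns [key => [values...]] for every key that begins with $prefix. */
    public function search(string $prefix): array
    {
        $node = $this->root;
        for ($i = 0, $length = strlen($prefix); $i < $length; $i++) {
            if (!isset($node->children[$prefix[$i]])) {
                return [];
            }
            $node = $node->children[$prefix[$i]];
        }
        $found = [];
        $this->collect($node, $prefix, $found);
        return $found;
    }

    private function collect(TrieNode $node, string $key, array &$found): void
    {
        if ($node->values !== []) {
            $found[$key] = $node->values;
        }
        foreach ($node->children as $byte => $child) {
            // The key is rebuilt from the path, one byte per level.
            $this->collect($child, $key . $byte, $found);
        }
    }
}

$trie = new SimpleTrie();
$trie->add('cat', 'cat data');
$trie->add('car', 'car data');
$trie->add('cart', 'cart data');
$matches = $trie->search('car');
// $matches is ['car' => ['car data'], 'cart' => ['cart data']]
```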
Tries – Radix Trie
• Node pointers comprise one or more characters or bytes
• This means they can be more compact and memory efficient than
a basic Trie
• It can add more overhead to building the Trie
• It may be faster to search the Trie hierarchy
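The extra build overhead comes from splitting an existing edge whenever a new key shares only part of its label. A minimal sketch of that idea follows; the function and class names are hypothetical and this is not the linked repository's implementation.

```php
<?php
final class RadixNode
{
    /** @var array<string, RadixNode> children keyed by an edge label of one or more bytes */
    public array $children = [];
    /** @var array bucket of values whose key ends at this node */
    public array $values = [];
}

function commonPrefixLength(string $a, string $b): int
{
    $max = min(strlen($a), strlen($b));
    $i = 0;
    while ($i < $max && $a[$i] === $b[$i]) {
        $i++;
    }
    return $i;
}

function radixAdd(RadixNode $node, string $key, $value): void
{
    if ($key === '') {
        $node->values[] = $value;
        return;
    }
    foreach ($node->children as $label => $child) {
        $label = (string) $label; // numeric labels come back as integers
        $shared = commonPrefixLength($label, $key);
        if ($shared === 0) {
            continue;
        }
        if ($shared < strlen($label)) {
            // Only part of the edge matches: split it at the shared prefix.
            $middle = new RadixNode();
            $middle->children[substr($label, $shared)] = $child;
            unset($node->children[$label]);
            $node->children[substr($label, 0, $shared)] = $middle;
            $child = $middle;
        }
        radixAdd($child, substr($key, $shared), $value);
        return;
    }
    // No edge shares a prefix with the key: add one edge for the whole remainder.
    $leaf = new RadixNode();
    $leaf->values[] = $value;
    $node->children[$key] = $leaf;
}

function radixCollect(RadixNode $node, string $key, array &$found): void
{
    if ($node->values !== []) {
        $found[$key] = $node->values;
    }
    foreach ($node->children as $label => $child) {
        radixCollect($child, $key . $label, $found);
    }
}

/** Returns [key => [values...]] for every key that begins with $prefix. */
function radixSearch(RadixNode $node, string $prefix, string $path = ''): array
{
    if ($prefix === '') {
        $found = [];
        radixCollect($node, $path, $found);
        return $found;
    }
    foreach ($node->children as $label => $child) {
        $label = (string) $label;
        $shared = commonPrefixLength($label, $prefix);
        if ($shared === 0) {
            continue;
        }
        if ($shared === strlen($prefix) || $shared === strlen($label)) {
            // The prefix ends inside this edge, or the whole edge matched.
            return radixSearch($child, substr($prefix, $shared), $path . $label);
        }
        return []; // the prefix diverges part-way along the edge
    }
    return [];
}

$root = new RadixNode();
radixAdd($root, 'cat', 'cat data');
radixAdd($root, 'car', 'car data');
radixAdd($root, 'cart', 'cart data');
$matches = radixSearch($root, 'car');
// The root now has one edge 'ca', whose node has edges 't' and 'r'.
```

Three keys need four nodes below the root here, against five in the basic Trie, because the shared prefix "ca" is held on a single edge.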
Tries – Radix Trie
$radixTrie = new RadixTrie();
$radixTrie->add('cat', 'cat data');
[Diagram: a single node CAT]
Tries – Radix Trie
$radixTrie = new RadixTrie();
$radixTrie->add('cat', 'cat data');
$radixTrie->add('car', 'car data');
[Diagram: a node CA, which branches to T and R]
Tries – Radix Trie
$radixTrie = new RadixTrie();
$radixTrie->add('cat', 'cat data');
$radixTrie->add('car', 'car data');
$radixTrie->add('cart', 'cart data');
[Diagram: a node CA, which branches to T and R; R has a further child T]
Tries – Suffix Trie
$suffixTrie = new SuffixTrie();
$suffixTrie->add('cat', 'cat data');
[Diagram: the first branch C → A → T]
Tries – Suffix Trie
$suffixTrie = new SuffixTrie();
$suffixTrie->add('cat', 'cat data');
[Diagram: three branches from the root, one per suffix: C → A → T, A → T, and T]
Tries – Suffix Trie
$suffixTrie = new SuffixTrie();
$suffixTrie->add('cat', 'cat data');
$suffixTrie->search('at');
[Diagram: the same Suffix Trie with the A → T branch highlighted]
Tries – Suffix Tries
• Memory hungry
  • Up to n + (n-1) + (n-2) + … + 2 + 1 = n(n+1)/2 nodes (where n is
    the key length) for every key/value stored in a Suffix Trie
• Slow to populate
• Can be used to search for “contains” rather than simply
  “begins with”
Tries – Suffix Tries
• It is necessary to store the key with the data
• A search can return duplicate values
  • e.g. “banana” is matched more than once if we search for “a”,
    “n” or even “ana”
• Data should only be stored once for the “full word”, and
  subsequent sequences should only store a pointer to that
  data
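One way to follow that advice is sketched below. Each key/value pair is stored once in a record list, and the Trie nodes hold only record ids, kept as array keys so that repeated matches collapse into one. `SimpleSuffixTrie` and its members are hypothetical names. Recording the id at every node along each suffix, rather than only at the end of it, is a simplification chosen here to keep the search short; it is not taken from the linked repository.

```php
<?php
final class SuffixNode
{
    /** @var array<string, SuffixNode> child nodes, keyed by a single byte */
    public array $children = [];
    /** @var array<int, true> ids of the records whose key contains the path to this node */
    public array $recordIds = [];
}

final class SimpleSuffixTrie
{
    private SuffixNode $root;
    /** @var array<int, array> each key/value pair, stored exactly once */
    private array $records = [];

    public function __construct()
    {
        $this->root = new SuffixNode();
    }

    public function add(string $key, $value = null): void
    {
        $id = count($this->records);
        $this->records[$id] = ['key' => $key, 'value' => $value];

        $length = strlen($key);
        // Insert every suffix of the key: 'cat' gives 'cat', 'at' and 't'.
        for ($start = 0; $start < $length; $start++) {
            $node = $this->root;
            for ($i = $start; $i < $length; $i++) {
                $byte = $key[$i];
                if (!isset($node->children[$byte])) {
                    $node->children[$byte] = new SuffixNode();
                }
                $node = $node->children[$byte];
                $node->recordIds[$id] = true; // a pointer to the record, not a copy
            }
        }
    }

    /** Returns each record whose key contains $fragment, once per record. */
    public function search(string $fragment): array
    {
        $node = $this->root;
        for ($i = 0, $length = strlen($fragment); $i < $length; $i++) {
            if (!isset($node->children[$fragment[$i]])) {
                return [];
            }
            $node = $node->children[$fragment[$i]];
        }
        $matches = [];
        foreach (array_keys($node->recordIds) as $id) {
            $matches[] = $this->records[$id];
        }
        return $matches;
    }
}

$suffixTrie = new SimpleSuffixTrie();
$suffixTrie->add('banana', 'banana data');
$suffixTrie->add('cat', 'cat data');
$ana = $suffixTrie->search('ana');
// 'ana' occurs twice in 'banana', but the record is returned once.
```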
QuadTrees
QuadTrees
• A Tree structure that partitions a 2-Dimensional space by
  recursively subdividing it into quadrants (or regions)
• Each node can contain:
  • A series of pointers (keys) to the next node in the hierarchy
  • A bucket for data values
• There are different types of QuadTrees:
  • Point QuadTrees
  • Region QuadTrees
  • Edge QuadTrees
  • Polygonal Map (PM) QuadTrees
QuadTrees – Purpose
• Fast Geo-spatial or Graph lookup
• Sparse data compression
• Example implementation
https://github.com/MarkBaker/QuadTrees
QuadTrees – Uses
• Spatial Indexing
• Storing Sparse Data, e.g.
  • Spreadsheet format data
  • Pixel data in images
• Collision Detection
• Points within a field of vision
QuadTrees – Methods
• insert($xyCoordinate, $value = null)
Adds new data to a QuadTree
• search($boundingBox)
Find data in a QuadTree
QuadTrees – Point QuadTree
• Used for Spatial Indexing
QuadTrees – Spatial Indexing
$quadTree = new QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
[Diagram: the root node, spanning -180 to 180 horizontally and -90 to 90 vertically]
QuadTrees – Spatial Indexing
$quadTree = new QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
$quadTree->add('London', 51.5072, -0.1275);
$quadTree->add('New York', 40.7127, -74.0059);
$quadTree->add('Paris', 48.8567, 2.3508);
[Diagram: the root node, spanning -180 to 180 horizontally and -90 to 90 vertically, holding the three points]
QuadTrees – Spatial Indexing
$quadTree = new QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
$quadTree->add('London', 51.5072, -0.1275);
$quadTree->add('New York', 40.7127, -74.0059);
$quadTree->add('Paris', 48.8567, 2.3508);
$quadTree->add('Munich', 48.1333, 11.5667);
$quadTree->add('Dublin', 53.3478, -6.2597);
$quadTree->add('Rome', 41.9000, 12.5000);
$quadTree->add('Athens', 37.9667, 23.7167);
[Diagram: the map subdivided into smaller quadrants once the bucket size is exceeded]
QuadTrees – Spatial Indexing
$quadTree = new QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
$quadTree->add('London', 51.5072, -0.1275);
$quadTree->add('New York', 40.7127, -74.0059);
$quadTree->add('Paris', 48.8567, 2.3508);
$quadTree->add('Munich', 48.1333, 11.5667);
$quadTree->add('Dublin', 53.3478, -6.2597);
$quadTree->add('Rome', 41.9000, 12.5000);
$quadTree->add('Athens', 37.9667, 23.7167);
$quadTree->add('Amsterdam', 52.3667, 4.9000);
[Diagram: the map subdivided further around the cluster of European points]
QuadTrees – Spatial Indexing
$quadTree = new QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
$quadTree->add('London', 51.5072, -0.1275);
$quadTree->add('New York', 40.7127, -74.0059);
$quadTree->add('Paris', 48.8567, 2.3508);
$quadTree->add('Munich', 48.1333, 11.5667);
$quadTree->add('Dublin', 53.3478, -6.2597);
$quadTree->add('Rome', 41.9000, 12.5000);
$quadTree->add('Athens', 37.9667, 23.7167);
$quadTree->add('Amsterdam', 52.3667, 4.9000);
…
// Search QuadTree for Northern Europe
$quadTree->find(
    -15.0, 60.0,
    25.0, 45.0
);
[Diagram: the subdivided map with the search bounding box overlaid; only the quadrants that intersect the box need checking]
QuadTrees – Spatial Indexing
• The top-level node need not be limited to the maximum
graph space (i.e. the whole world)
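The spatial-indexing walk-through can be sketched as a single recursive node class. This is an illustration under assumptions, not the linked repository's API: `QuadNode` is a hypothetical name, points are passed as x (longitude) then y (latitude), the depth limit is an added safeguard, and Dublin's longitude is written as negative (west of Greenwich).

```php
<?php
final class QuadNode
{
    private const MAX_DEPTH = 16; // assumed safeguard against endless splitting on identical points

    private float $left;
    private float $top;
    private float $right;
    private float $bottom;
    private int $bucketSize;
    private int $depth;
    /** @var array points held in this node's bucket, each as [x, y, value] */
    private array $points = [];
    /** @var QuadNode[] the four quadrants, once this node has been split */
    private array $children = [];

    public function __construct(float $left, float $top, float $right, float $bottom, int $bucketSize, int $depth = 0)
    {
        $this->left = $left;
        $this->top = $top;
        $this->right = $right;
        $this->bottom = $bottom;
        $this->bucketSize = $bucketSize;
        $this->depth = $depth;
    }

    public function insert(float $x, float $y, $value = null): bool
    {
        if ($x < $this->left || $x > $this->right || $y > $this->top || $y < $this->bottom) {
            return false; // the point lies outside this node's region
        }
        if ($this->children === []) {
            if (count($this->points) < $this->bucketSize || $this->depth >= self::MAX_DEPTH) {
                $this->points[] = [$x, $y, $value];
                return true;
            }
            $this->split();
        }
        foreach ($this->children as $child) {
            if ($child->insert($x, $y, $value)) {
                return true;
            }
        }
        return false;
    }

    private function split(): void
    {
        $midX = ($this->left + $this->right) / 2;
        $midY = ($this->top + $this->bottom) / 2;
        $size = $this->bucketSize;
        $next = $this->depth + 1;
        $this->children = [
            new QuadNode($this->left, $this->top, $midX, $midY, $size, $next),     // north-west
            new QuadNode($midX, $this->top, $this->right, $midY, $size, $next),    // north-east
            new QuadNode($this->left, $midY, $midX, $this->bottom, $size, $next),  // south-west
            new QuadNode($midX, $midY, $this->right, $this->bottom, $size, $next), // south-east
        ];
        // Move the bucket's points down into the new quadrants.
        $points = $this->points;
        $this->points = [];
        foreach ($points as [$x, $y, $value]) {
            $this->insert($x, $y, $value);
        }
    }

    /** Returns the values of all points inside the bounding box. */
    public function search(float $left, float $top, float $right, float $bottom): array
    {
        if ($left > $this->right || $right < $this->left || $bottom > $this->top || $top < $this->bottom) {
            return []; // the box does not touch this node, so skip the whole subtree
        }
        $found = [];
        foreach ($this->points as [$x, $y, $value]) {
            if ($x >= $left && $x <= $right && $y <= $top && $y >= $bottom) {
                $found[] = $value;
            }
        }
        foreach ($this->children as $child) {
            foreach ($child->search($left, $top, $right, $bottom) as $value) {
                $found[] = $value;
            }
        }
        return $found;
    }
}

$tree = new QuadNode(-180, 90, 180, -90, 3);
$tree->insert(-0.1275, 51.5072, 'London');
$tree->insert(-74.0059, 40.7127, 'New York');
$tree->insert(2.3508, 48.8567, 'Paris');
$tree->insert(11.5667, 48.1333, 'Munich');
$tree->insert(-6.2597, 53.3478, 'Dublin');
$tree->insert(12.5000, 41.9000, 'Rome');
$tree->insert(23.7167, 37.9667, 'Athens');
$tree->insert(4.9000, 52.3667, 'Amsterdam');

$found = $tree->search(-15.0, 60.0, 25.0, 45.0);
sort($found);
// $found is ['Amsterdam', 'Dublin', 'London', 'Munich', 'Paris']
```

Passing a smaller box to the constructor, for example one covering only Europe, is all that is needed to start from a top-level node narrower than the whole world.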
QuadTrees – Spatial Indexing
• With a larger bucket size
  • QuadTree is smaller, fewer nodes using less memory
  • More points need checking in each node
  • Faster to insert / slower to search
• With a smaller bucket size
  • The QuadTree uses more memory
  • Fewer points in each node to check
  • Slower to insert / faster to search
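The memory side of this trade-off can be illustrated by counting how many nodes a point QuadTree would need for the same points at different bucket sizes. The function below is a hypothetical helper written for this illustration. It splits any region holding more points than the bucket size, with an assumed depth limit, and uses the eight cities from the earlier slides as [longitude, latitude] pairs, with Dublin's longitude written as negative.

```php
<?php
/**
 * Counts the nodes needed to hold $points (a list of [x, y] pairs) in the
 * given region, splitting any region with more than $bucketSize points.
 */
function countQuadTreeNodes(
    array $points,
    float $left,
    float $top,
    float $right,
    float $bottom,
    int $bucketSize,
    int $depth = 0
): int {
    if (count($points) <= $bucketSize || $depth >= 16) {
        return 1; // a single leaf node is enough
    }
    $midX = ($left + $right) / 2;
    $midY = ($top + $bottom) / 2;

    // Quadrant index: 0 = north-west, 1 = north-east, 2 = south-west, 3 = south-east.
    $quadrants = [[], [], [], []];
    foreach ($points as [$x, $y]) {
        $index = ($x > $midX ? 1 : 0) + ($y < $midY ? 2 : 0);
        $quadrants[$index][] = [$x, $y];
    }
    $boxes = [
        [$left, $top, $midX, $midY],
        [$midX, $top, $right, $midY],
        [$left, $midY, $midX, $bottom],
        [$midX, $midY, $right, $bottom],
    ];

    $total = 1; // this node, plus everything below it
    foreach ($quadrants as $i => $subset) {
        [$l, $t, $r, $b] = $boxes[$i];
        $total += countQuadTreeNodes($subset, $l, $t, $r, $b, $bucketSize, $depth + 1);
    }
    return $total;
}

$cities = [
    [-0.1275, 51.5072],  // London
    [-74.0059, 40.7127], // New York
    [2.3508, 48.8567],   // Paris
    [11.5667, 48.1333],  // Munich
    [-6.2597, 53.3478],  // Dublin
    [12.5000, 41.9000],  // Rome
    [23.7167, 37.9667],  // Athens
    [4.9000, 52.3667],   // Amsterdam
];

$largeBucket = countQuadTreeNodes($cities, -180, 90, 180, -90, 8); // 1 node
$midBucket   = countQuadTreeNodes($cities, -180, 90, 180, -90, 3); // 9 nodes
$smallBucket = countQuadTreeNodes($cities, -180, 90, 180, -90, 1); // more nodes again
```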
QuadTrees – Region QuadTree
• Used for Sparse-data Compression
• Used for Level-based Aggregations
QuadTrees – Image Compression
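A minimal sketch of how a region QuadTree can compress sparse pixel data: any square region whose cells all hold the same value collapses into a single leaf, and only mixed regions are subdivided. The function names and the tiny 4×4 "image" are made up for illustration, and the grid side is assumed to be a power of two.

```php
<?php
/**
 * Compresses the square region of $grid that starts at ($row, $col) and has
 * side length $size. Returns the shared value for a uniform region, or an
 * array of four sub-regions [NW, NE, SW, SE] for a mixed one.
 */
function compressRegion(array $grid, int $row, int $col, int $size)
{
    $first = $grid[$row][$col];
    $uniform = true;
    for ($r = $row; $r < $row + $size && $uniform; $r++) {
        for ($c = $col; $c < $col + $size; $c++) {
            if ($grid[$r][$c] !== $first) {
                $uniform = false;
                break;
            }
        }
    }
    if ($uniform) {
        return $first; // one leaf stands in for the whole region
    }
    $half = intdiv($size, 2);
    return [
        compressRegion($grid, $row, $col, $half),                 // north-west
        compressRegion($grid, $row, $col + $half, $half),         // north-east
        compressRegion($grid, $row + $half, $col, $half),         // south-west
        compressRegion($grid, $row + $half, $col + $half, $half), // south-east
    ];
}

/** Counts the leaf values in a tree produced by compressRegion(). */
function countLeaves($node): int
{
    if (!is_array($node)) {
        return 1;
    }
    return array_sum(array_map('countLeaves', $node));
}

$image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
];

$tree = compressRegion($image, 0, 0, 4);
$leaves = countLeaves($tree);
// $tree is [0, 0, [1, 1, 1, 0], 0]: 7 leaves instead of 16 pixels
```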
QuadTrees
• The same principles can be applied to 3-Dimensional space
using an Octree
Questions?
Who am I?
Mark Baker
Design and Development Manager
InnovEd (Innovative Solutions for Education) Learning Ltd
Coordinator and Developer of:
Open Source PHPOffice library
PHPExcel, PHPWord, PHPPowerPoint, PHPProject, PHPVisio
Minor contributor to PHP core
Other small open source libraries available on github
@Mark_Baker
https://github.com/MarkBaker
http://uk.linkedin.com/pub/mark-baker/b/572/171
