SPL: The Undiscovered Library - DataStructures

Slides from presentation given to the Brighton PHP group on 15th December 2014

Published in: Technology, Education

Transcript

  • 1. SPL The Undiscovered Library Exploring DataStructures
  • 2. Who am I?
      Mark Baker
      Design and Development Manager, InnovEd (Innovative Solutions for Education) Ltd
      Coordinator and Developer of the Open Source PHPOffice library: PHPExcel, PHPWord, PHPPowerPoint, PHPProject, PHPVisio
      Minor contributor to PHP core
      @Mark_Baker
      https://github.com/MarkBaker
      http://uk.linkedin.com/pub/mark-baker/b/572/171
  • 3. SPL – Standard PHP Library
      • SPL provides a standard set of interfaces for PHP5
      • The aim of SPL is to implement some efficient data access interfaces and classes for PHP
      • Introduced with PHP 5.0.0
      • Always available since PHP 5.3.0 (it can no longer be disabled)
      • SPL DataStructures were added in version 5.3.0
  • 4. SPL DataStructures Dictionary DataStructures (Maps) • Fixed Arrays Linear DataStructures • Doubly-Linked Lists • Stacks • Queues Tree DataStructures • Heaps
  • 5. SPL DataStructures – Why use them?
      • Can improve performance
          • When the right structures are used in the right place
      • Can reduce memory usage
          • When the right structures are used in the right place
      • Already implemented and tested in PHP core
          • Saves work!
      • Can be type-hinted in function/method definitions
          • Adds semantics to your code
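The type-hinting point can be illustrated with a minimal sketch. The function name `drainJobs()` and the job names are hypothetical, and the scalar and return type declarations assume PHP 7 or later; only the `SplQueue` calls come from SPL itself.

```php
<?php
// Hypothetical helper: the SplQueue type hint documents (and enforces)
// that callers must pass a FIFO queue, not an arbitrary array.
function drainJobs(SplQueue $jobs): array
{
    $processed = [];
    while (!$jobs->isEmpty()) {
        // dequeue() removes and returns the oldest entry
        $processed[] = strtoupper($jobs->dequeue());
    }
    return $processed;
}

$jobs = new SplQueue();
$jobs->enqueue('resize');
$jobs->enqueue('upload');

$result = drainJobs($jobs);
print_r($result);
```

Passing a plain array to `drainJobs()` would fail with a type error, which is the "semantics" benefit the slide refers to.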
  • 6. SPL DataStructures Dictionary DataStructures (Maps) • Fixed Arrays Linear DataStructures Tree DataStructures
  • 7. Fixed Arrays • Predefined Size • Enumerated indexes only, not Associative • Indexed from 0 • Is an object • No hashing required for keys • Implements • Iterator • ArrayAccess • Countable
  • 8. Fixed Arrays – Uses • Returned Database resultsets, Record collections • Hours of Day • Days of Month/Year • Hotel Rooms, Airline seats As a 2-d fixed array
  • 9. Fixed Arrays – Big-O Complexity • Insert an element O(1) • Delete an element O(1) • Lookup an element O(1) • Resize a Fixed Array O(n)
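A minimal sketch of the fixed-array behaviour described above, using the "hours of day" use case. The variable names and values are illustrative assumptions; `getSize()` and `setSize()` are the standard `SplFixedArray` methods.

```php
<?php
$hours = new SplFixedArray(24);      // 24 slots, indexed 0..23, all NULL initially
$hours[9]  = 'stand-up';
$hours[13] = 'lunch';

$sizeBefore = $hours->getSize();

// Writing outside the predefined size is an error, not an implicit append.
try {
    $hours[24] = 'overflow';
    $outOfRange = false;
} catch (RuntimeException $e) {
    $outOfRange = true;
}

// Resizing is allowed, but it is the O(n) operation from the table above.
$hours->setSize(48);
$sizeAfter = $hours->getSize();

echo $sizeBefore, ' -> ', $sizeAfter, PHP_EOL;
```

Existing entries survive the resize, and the new slots start out as NULL.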
  • 10. Fixed Arrays
      [Diagram: Standard Arrays vs SPLFixedArray. In a standard array, record keys (12345, 23456, 34567, 45678) pass through a hash function to reach slots [0] … [n-1]. In an SPLFixedArray, records are stored directly at keys 0, 1, 2, 3 in slots [0] … [n-1].]
  • 11. Fixed Arrays
      $a = array();
      for ($i = 0; $i < $size; ++$i) {
          $a[$i] = $i;
      }
      // Random/Indexed access
      for ($i = 0; $i < $size; ++$i) {
          $r = $a[$i];
      }
      // Sequential access
      foreach($a as $v) {
      }
      // Sequential access with keys
      foreach($a as $k => $v) {
      }

      Initialise: 0.0000 s
      Set 1,000,000 Entries: 0.4671 s
      Read 1,000,000 Entries: 0.3326 s
      Iterate values for 1,000,000 Entries: 0.0436 s
      Iterate keys and values for 1,000,000 Entries: 0.0839 s
      Total Time: 0.9272 s
      Memory: 82,352.55 k
  • 12. Fixed Arrays
      $a = new SPLFixedArray($size);
      for ($i = 0; $i < $size; ++$i) {
          $a[$i] = $i;
      }
      // Random/Indexed access
      for ($i = 0; $i < $size; ++$i) {
          $r = $a[$i];
      }
      // Sequential access
      foreach($a as $v) {
      }
      // Sequential access with keys
      foreach($a as $k => $v) {
      }

      Initialise: 0.0013 s
      Set 1,000,000 Entries: 0.3919 s
      Read 1,000,000 Entries: 0.3277 s
      Iterate values for 1,000,000 Entries: 0.1129 s
      Iterate keys and values for 1,000,000 Entries: 0.1531 s
      Total Time: 0.9869 s
      Memory: 35,288.41 k
  • 13. Fixed Arrays
      [Charts: "Speed" (Initialise, Set Values, Sequential Read, Random Read, Pop; in seconds) and "Memory Usage" (Current Memory, Peak Memory; in k), each comparing SPL Fixed Array with Standard PHP Array.]
  • 14. Fixed Arrays • Faster direct access • Lower memory usage • Faster for random/indexed access than for sequential access
  • 15. Fixed Arrays – Gotchas • Can be extended, but at a cost in speed • Standard array functions won’t work with SPLFixedArray e.g. array_walk(), sort(), array_pop(), implode() • Avoid unsetting elements if possible • Unlike standard PHP enumerated arrays, this leaves empty nodes that trigger an Exception if accessed
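One common workaround for the array-function gotcha is to copy the data into a standard array, apply the function there, and convert back if needed. This is a sketch of that approach, not something shown on the slides; the copy costs extra time and memory, so it suits occasional use only.

```php
<?php
$fixed = SplFixedArray::fromArray([3, 1, 2]);

$plain = $fixed->toArray();   // copy into a standard PHP array
sort($plain);                 // standard array functions work on the copy
$csv = implode(',', $plain);

// Convert back when the fixed-size structure is needed again.
$sorted = SplFixedArray::fromArray($plain);

echo $csv, PHP_EOL;
```

The original `$fixed` is left untouched, because `toArray()` returns a copy.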
  • 16. SPL DataStructures Dictionary DataStructures (Maps) Linear DataStructures • Doubly-Linked Lists • Stacks • Queues Tree DataStructures
  • 17. Doubly Linked Lists
  • 18. Doubly Linked Lists • Iterable Lists • Top to Bottom • Bottom to Top • Unindexed • Good for sequential access • Not good for random/indexed access • Implements • Iterator • ArrayAccess • Countable
  • 19. Doubly Linked Lists – Uses • Stacks • Queues • Most-recently used lists • Undo functionality • Trees • Memory Allocators • Fast dynamic, iterable arrays (not PHP’s hashed arrays) • iNode maps • Video frame queues
  • 20. Doubly Linked Lists – Big-O Complexity
      • Insert an element by index O(1)
      • Delete an element by index O(1)
      • Lookup by index O(n)
          • I have seen people saying that SPLDoublyLinkedList behaves like a hash table for lookups, which would make it O(1); but timing tests prove otherwise
      • Access a node at the beginning of the list O(1)
      • Access a node at the end of the list O(1)
  • 21. Doubly Linked Lists
      [Diagram: a doubly linked list of nodes A, B, C, D, E between Head and Tail.]
  • 22. Doubly Linked Lists
      $a = array();
      for ($i = 0; $i < $size; ++$i) {
          $a[$i] = $i;
      }
      // Random/Indexed access
      for ($i = 0; $i < $size; ++$i) {
          $r = $a[$i];
      }
      // Sequential access
      for ($i = 0; $i < $size; ++$i) {
          $r = array_pop($a);
      }

      Initialise: 0.0000 s
      Set 100,000 Entries: 0.0585 s
      Read 100,000 Entries: 0.0378 s
      Pop 100,000 Entries: 0.1383 s
      Total Time: 0.2346 s
      Memory: 644.55 k
      Peak Memory: 8457.91 k
  • 23. Doubly Linked Lists
      $a = new SplDoublyLinkedList();
      for ($i = 0; $i < $size; ++$i) {
          $a->push($i);
      }
      // Random/Indexed access
      for ($i = 0; $i < $size; ++$i) {
          $r = $a->offsetGet($i);
      }
      // Sequential access
      for ($i = $size-1; $i >= 0; --$i) {
          $a->pop();
      }

      Initialise: 0.0000 s
      Set 100,000 Entries: 0.1514 s
      Read 100,000 Entries: 22.7068 s
      Pop 100,000 Entries: 0.1465 s
      Total Time: 23.0047 s
      Memory: 133.67 k
      Peak Memory: 5603.09 k
  • 24. Doubly Linked Lists • Fast for sequential access • Lower memory usage • Traversable in both directions • Size limited only by memory • Slow for random/indexed access • Insert into middle of list only available from PHP 5.5.0
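A short sketch of the two-way traversal and the mid-list insert mentioned above. The letters are illustrative; `add()` requires PHP 5.5.0 or later, as the slide notes.

```php
<?php
$list = new SplDoublyLinkedList();
foreach (['A', 'B', 'D', 'E'] as $item) {
    $list->push($item);               // append at the tail
}
$list->add(2, 'C');                   // insert at index 2, shifting D and E up

// Head-to-tail traversal
$forward = [];
$list->setIteratorMode(SplDoublyLinkedList::IT_MODE_FIFO);
foreach ($list as $value) {
    $forward[] = $value;
}

// Tail-to-head traversal of the same list
$backward = [];
$list->setIteratorMode(SplDoublyLinkedList::IT_MODE_LIFO);
foreach ($list as $value) {
    $backward[] = $value;
}

echo implode('', $forward), ' / ', implode('', $backward), PHP_EOL;
```

`bottom()` and `top()` read the first and last nodes directly, which is the O(1) end access listed in the complexity slide.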
  • 25. SPL DataStructures Dictionary DataStructures (Maps) Linear DataStructures • Doubly-Linked Lists • Stacks • Queues Tree DataStructures
  • 26. Stacks
  • 27. Stacks • Implemented as a Doubly-Linked List • LIFO • Last-In • First-Out • Essential Operations • push() • pop() • Optional Operations • count() • isEmpty() • peek()
  • 28. Stack – Uses • Undo mechanism (e.g. In text editors) • Backtracking (e.g. Finding a route through a maze) • Call Handler (e.g. Defining return location for nested calls) • Shunting Yard Algorithm (e.g. Converting Infix to Postfix notation) • Evaluating a Postfix Expression • Depth-First Search
  • 29. Stacks – Big-O Complexity • Push an element O(1) • Pop an element O(1)
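As an illustration of one of the listed uses (evaluating a postfix expression), here is a minimal sketch built on `SplStack`. The function `evaluatePostfix()`, its space-separated input format and its four operators are assumptions made for the example, not part of the presentation.

```php
<?php
// Hypothetical helper: evaluates a space-separated postfix expression.
function evaluatePostfix(string $expression): float
{
    $stack = new SplStack();
    foreach (preg_split('/\s+/', trim($expression)) as $token) {
        if (is_numeric($token)) {
            $stack->push((float) $token);      // operands go onto the stack
            continue;
        }
        if ($stack->count() < 2) {
            throw new InvalidArgumentException("Not enough operands for '$token'");
        }
        $right = $stack->pop();                // last pushed is the right operand
        $left  = $stack->pop();
        switch ($token) {
            case '+': $stack->push($left + $right); break;
            case '-': $stack->push($left - $right); break;
            case '*': $stack->push($left * $right); break;
            case '/': $stack->push($left / $right); break;
            default:
                throw new InvalidArgumentException("Unknown operator '$token'");
        }
    }
    if ($stack->count() !== 1) {
        throw new InvalidArgumentException('Malformed expression');
    }
    return $stack->pop();
}

$value = evaluatePostfix('3 4 + 2 *');         // (3 + 4) * 2
echo $value, PHP_EOL;
```

Every step is a push or a pop, so the whole evaluation is linear in the number of tokens.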
  • 30. Stacks
      class StandardArrayStack {
          private $_stack = array();

          public function count() {
              return count($this->_stack);
          }

          public function push($data) {
              $this->_stack[] = $data;
          }

          public function pop() {
              if (count($this->_stack) > 0) {
                  return array_pop($this->_stack);
              }
              return NULL;
          }

          public function isEmpty() {
              return count($this->_stack) == 0;
          }
      }
  • 31. Stacks
      $a = new StandardArrayStack();
      for ($i = 1; $i <= $size; ++$i) {
          $a->push($i);
      }
      while (!$a->isEmpty()) {
          $i = $a->pop();
      }

      PUSH 100,000 ENTRIES
      Push Time: 0.5818 s
      Current Memory: 8.75
      POP 100,000 ENTRIES
      Pop Time: 1.6657 s
      Current Memory: 2.25
      Total Time: 2.2488 s
      Current Memory: 2.25
      Peak Memory: 8.75
  • 32. Stacks
      class StandardArrayStack {
          private $_stack = array();
          private $_count = 0;   // cached size, avoids calling count() on every operation

          public function count() {
              return $this->_count;
          }

          public function push($data) {
              ++$this->_count;
              $this->_stack[] = $data;
          }

          public function pop() {
              if ($this->_count > 0) {
                  --$this->_count;
                  return array_pop($this->_stack);
              }
              return NULL;
          }

          public function isEmpty() {
              return $this->_count == 0;
          }
      }
  • 33. Stacks
      $a = new StandardArrayStack();
      for ($i = 1; $i <= $size; ++$i) {
          $a->push($i);
      }
      while (!$a->isEmpty()) {
          $i = $a->pop();
      }

      PUSH 100,000 ENTRIES
      Push Time: 0.5699 s
      Current Memory: 8.75
      POP 100,000 ENTRIES
      Pop Time: 1.1005 s
      Current Memory: 1.75
      Total Time: 1.6713 s
      Current Memory: 1.75
      Peak Memory: 8.75
  • 34. Stacks
      $a = new SPLStack();
      for ($i = 1; $i <= $size; ++$i) {
          $a->push($i);
      }
      while (!$a->isEmpty()) {
          $i = $a->pop();
      }

      PUSH 100,000 ENTRIES
      Push Time: 0.4301 s
      Current Memory: 5.50
      POP 100,000 ENTRIES
      Pop Time: 0.6413 s
      Current Memory: 0.75
      Total Time: 1.0723 s
      Current Memory: 0.75
      Peak Memory: 5.50
  • 35. Stacks
      [Chart: "Stack Timings". Series: Push Time (s), Pop Time (s), Memory after Push (MB). Categories: StandardArrayStack, StandardArrayStack2, SPLStack. Axes: Time (seconds), Memory (MB). Data labels as extracted: 0.0796, 0.0782, 0.0644, 0.1244, 0.0998, 0.0693; 8.75, 8.75, 5.50.]
  • 36. Stacks – Gotchas
      • Peek (view an entry from the middle of the stack)
      • StandardArrayStack
          public function peek($n = 0) {
              $index = count($this->_stack) - $n - 1;
              // NULL when $n points outside the stack
              if ($n < 0 || $index < 0) {
                  return NULL;
              }
              return $this->_stack[$index];
          }
      • StandardArrayStack2
          public function peek($n = 0) {
              $index = $this->_count - $n - 1;
              // NULL when $n points outside the stack
              if ($n < 0 || $index < 0) {
                  return NULL;
              }
              return $this->_stack[$index];
          }
      • SPLStack
          $r = $a->offsetGet($n);
  • 37. Stacks – Gotchas
      [Chart: "Stack Timings". Series: Push Time (s), Peek Time (s), Pop Time (s), Memory after Push (MB). Categories: StandardArrayStack, StandardArrayStack2, SPLStack. Axes: Time (seconds), Memory (MB). Data labels as extracted: 0.0075, 0.0077, 0.0064, 0.0111, 0.0078, 0.1627, 0.0124, 0.0098, 0.0066; 1.00, 1.00, 0.75.]
  • 38. Stacks – Gotchas • Peek When looking through the stack, SPLStack has to follow each link in the “chain” until it finds the nth entry
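When only the newest entry is needed, the chain walk can be avoided altogether. A minimal sketch, with illustrative values: `top()` reads the most recently pushed entry directly and leaves the stack unchanged.

```php
<?php
$stack = new SplStack();
foreach (['first', 'second', 'third'] as $item) {
    $stack->push($item);
}

$top = $stack->top();               // reads the last pushed entry without removing it
$countAfterPeek = $stack->count();  // unchanged by top()

echo $top, ' (', $countAfterPeek, ' entries remain)', PHP_EOL;
```

The slow path described on the slide only applies when looking deeper into the stack by position.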
  • 39. SPL DataStructures Dictionary DataStructures (Maps) Linear DataStructures • Doubly-Linked Lists • Stacks • Queues Tree DataStructures
  • 40. Queues
  • 41. Queues • Implemented as a Doubly-Linked List • FIFO • First-In • First-Out • Essential Operations • enqueue() • dequeue() • Optional Operations • count() • isEmpty() • peek()
  • 42. Queues – Uses • Job/print/message submissions • Breadth-First Search • Request handling (e.g. a Web server)
  • 43. Queues – Big-O Complexity • Enqueue an element O(1) • Dequeue an element O(1)
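Breadth-first search, one of the uses listed above, maps naturally onto `SplQueue`. The sketch below uses a small hypothetical graph and a hypothetical function name, `breadthFirstOrder()`; neither appears in the presentation.

```php
<?php
// Hypothetical adjacency list for a small undirected graph.
$graph = [
    'A' => ['B', 'C'],
    'B' => ['A', 'D'],
    'C' => ['A', 'D'],
    'D' => ['B', 'C', 'E'],
    'E' => ['D'],
];

// Returns the nodes in the order a breadth-first search visits them.
function breadthFirstOrder(array $graph, string $start): array
{
    $visited = [$start => true];
    $order = [];
    $queue = new SplQueue();
    $queue->enqueue($start);

    while (!$queue->isEmpty()) {
        $node = $queue->dequeue();             // oldest discovered node first
        $order[] = $node;
        foreach ($graph[$node] as $neighbour) {
            if (!isset($visited[$neighbour])) {
                $visited[$neighbour] = true;   // mark on discovery to avoid duplicates
                $queue->enqueue($neighbour);
            }
        }
    }
    return $order;
}

$order = breadthFirstOrder($graph, 'A');
echo implode(' ', $order), PHP_EOL;
```

Each node is enqueued and dequeued once, so the O(1) queue operations keep the search linear in nodes plus edges.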
  • 44. Queues
      class StandardArrayQueue {
          private $_queue = array();
          private $_count = 0;

          public function count() {
              return $this->_count;
          }

          public function enqueue($data) {
              ++$this->_count;
              $this->_queue[] = $data;
          }

          public function dequeue() {
              if ($this->_count > 0) {
                  --$this->_count;
                  return array_shift($this->_queue);
              }
              return NULL;
          }

          public function isEmpty() {
              return $this->_count == 0;
          }
      }
  • 45. Queues
      $a = new StandardArrayQueue();
      for ($i = 1; $i <= $size; ++$i) {
          $a->enqueue($i);
      }
      while (!$a->isEmpty()) {
          $i = $a->dequeue();
      }

      ENQUEUE 100,000 ENTRIES
      Enqueue Time: 0.6884
      Current Memory: 8.75
      DEQUEUE 100,000 ENTRIES
      Dequeue Time: 335.8434
      Current Memory: 1.75
      Total Time: 336.5330
      Current Memory: 1.75
      Peak Memory: 8.75
  • 46. Queues
      $a = new SPLQueue();
      for ($i = 1; $i <= $size; ++$i) {
          $a->enqueue($i);
      }
      while (!$a->isEmpty()) {
          $i = $a->dequeue();
      }

      ENQUEUE 100,000 ENTRIES
      Enqueue Time: 0.4087
      Current Memory: 5.50
      DEQUEUE 100,000 ENTRIES
      Dequeue Time: 0.6148
      Current Memory: 0.75
      Total Time: 1.0249
      Current Memory: 0.75
      Peak Memory: 5.50
  • 47. Queues
      [Chart: "Queue Timings". Series: Enqueue Time (s), Peek Time (s), Dequeue Time (s), Memory after Enqueue (MB). Categories: StandardArrayQueue, StandardArrayQueue2, SPLQueue. Axes: Time (seconds), Memory (MB). Data labels as extracted: 0.0075, 0.0080, 0.0064, 0.0087, 0.0070, 0.1582, 0.6284, 0.6277, 0.0066; 1.00, 1.00, 0.75.]
  • 48. Queues – Gotchas
      • Dequeue: in standard PHP enumerated arrays, array_shift() and array_unshift() are expensive operations because they re-index the entire array. This problem does not apply to SPLQueue.
      • Peek: when looking through the queue, SPLQueue has to follow each link in the “chain” until it finds the nth entry.
  • 49. SPL DataStructures Dictionary DataStructures (Maps) Linear DataStructures Tree DataStructures • Heaps
  • 50. Heaps  
  • 51. Heaps • Ordered Lists • Random Input • Ordered Output • Implemented as a binary tree structure • Essential Operations • Insert • Extract • Ordering Rule • Abstract that requires extending with the implementation of a compare() algorithm • compare() is reversed in comparison with usort compare callbacks • Partially sorted on data entry
  • 52. Heaps – Uses • Heap sort • Selection algorithms (e.g. Max, Min, Median) • Graph algorithms • Prim’s Minimal Spanning Tree (connected weighted undirected graph) • Dijkstra’s Shortest Path (network or traffic routing) • Priority Queues
  • 53. Heaps – Big-O Complexity • Insert an element O(log n) • Delete an element O(log n) • Access root element O(1)
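For simple values, SPL already ships concrete heaps, so `compare()` does not always have to be written by hand. The sketch below uses `SplMinHeap` and `SplPriorityQueue` with made-up numbers and task names to show the insert, extract and root-access operations from the table above.

```php
<?php
// Random input, ordered output.
$minHeap = new SplMinHeap();
foreach ([42, 7, 19, 3, 25] as $number) {
    $minHeap->insert($number);          // O(log n) per insert
}

$smallest = $minHeap->top();            // O(1) look at the root, nothing removed

$ordered = [];
while (!$minHeap->isEmpty()) {
    $ordered[] = $minHeap->extract();   // O(log n) per extract
}

// A priority queue is a heap keyed on a separate priority value;
// higher priorities are extracted first.
$tasks = new SplPriorityQueue();
$tasks->insert('write report', 1);
$tasks->insert('fix outage', 10);
$tasks->insert('reply to email', 5);
$first = $tasks->extract();

echo implode(', ', $ordered), ' / ', $first, PHP_EOL;
```

Extracting every element in turn, as the `while` loop does, is effectively a heap sort.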
  • 54. Heaps
  • 55. Heaps
      class ExtendedSPLHeap extends SPLHeap {
          // Larger latitudes rise to the root, so extraction runs north to south
          protected function compare($a, $b) {
              if ($a->latitude == $b->latitude) {
                  return 0;
              }
              return ($a->latitude < $b->latitude) ? -1 : 1;
          }
      }

      $citiesHeap = new ExtendedSPLHeap();

      $file = new SplFileObject("cities.csv");
      $file->setFlags(
          SplFileObject::DROP_NEW_LINE |
          SplFileObject::SKIP_EMPTY
      );
      while (!$file->eof()) {
          $cityData = $file->fgetcsv();
          // Skip blank or incomplete rows (e.g. the read at end of file)
          if (is_array($cityData) && count($cityData) >= 3) {
              $city = new StdClass;
              $city->name = $cityData[0];
              $city->latitude = $cityData[1];
              $city->longitude = $cityData[2];
              $citiesHeap->insert($city);
          }
      }
  • 56. Heaps
      echo 'There are ', $citiesHeap->count(), ' cities in the heap', PHP_EOL;
      echo 'FROM NORTH TO SOUTH', PHP_EOL;
      foreach($citiesHeap as $city) {
          echo sprintf(
              "%-20s %+3.4f %+3.4f" . PHP_EOL,
              $city->name,
              $city->latitude,
              $city->longitude
          );
      }
      echo 'There are ', $citiesHeap->count(), ' cities in the heap', PHP_EOL;
  • 57. Heaps (output of the code on slide 56, as far as it is shown on the slide)
      There are 69 cities in the heap
      FROM NORTH TO SOUTH
      Inverness            +57.4717 -4.2254
      Aberdeen             +57.1500 -2.1000
      Dundee               +56.4500 -2.9833
      Perth                +56.3954 -3.4353
      Stirling             +56.1172 -3.9397
      Edinburgh            +55.9500 -3.2200
      Glasgow              +55.8700 -4.2700
      Derry                +54.9966 -7.3086
      Newcastle upon Tyne  +54.9833 -1.5833
      Carlisle             +54.8962 -2.9316
      Sunderland           +54.8717 -1.4581
      Durham               +54.7771 -1.5607
      Belfast              +54.6000 -5.9167
      Lisburn              +54.5097 -6.0374
      Armagh               +54.2940 -6.6659
      Newry                +54.1781 -6.3357
      Ripon                +54.1381 -1.5223
  • 58. Heaps
      class ExtendedSPLHeap extends SPLHeap {
          const NORTH_TO_SOUTH = 'north_to_south';
          const SOUTH_TO_NORTH = 'south_to_north';
          const EAST_TO_WEST = 'east_to_west';
          const WEST_TO_EAST = 'west_to_east';

          protected $_sortSequence = self::NORTH_TO_SOUTH;

          // The comparison used depends on the selected sort sequence
          protected function compare($a, $b) {
              switch($this->_sortSequence) {
                  case self::NORTH_TO_SOUTH :
                      if ($a->latitude == $b->latitude) return 0;
                      return ($a->latitude < $b->latitude) ? -1 : 1;
                  case self::SOUTH_TO_NORTH :
                      if ($a->latitude == $b->latitude) return 0;
                      return ($b->latitude < $a->latitude) ? -1 : 1;
                  case self::EAST_TO_WEST :
                      if ($a->longitude == $b->longitude) return 0;
                      return ($a->longitude < $b->longitude) ? -1 : 1;
                  case self::WEST_TO_EAST :
                      if ($a->longitude == $b->longitude) return 0;
                      return ($b->longitude < $a->longitude) ? -1 : 1;
              }
          }

          public function setSortSequence(
              $sequence = self::NORTH_TO_SOUTH
          ) {
              $this->_sortSequence = $sequence;
          }
      }

      $sortSequence = ExtendedSPLHeap::WEST_TO_EAST;
      $citiesHeap = new ExtendedSPLHeap();
      $citiesHeap->setSortSequence($sortSequence);

      $file = new SplFileObject("cities.csv");
      $file->setFlags(
          SplFileObject::DROP_NEW_LINE |
          SplFileObject::SKIP_EMPTY
      );
      while (!$file->eof()) {
          $cityData = $file->fgetcsv();
          // Skip blank or incomplete rows (e.g. the read at end of file)
          if (is_array($cityData) && count($cityData) >= 3) {
              $city = new StdClass;
              $city->name = $cityData[0];
              $city->latitude = $cityData[1];
              $city->longitude = $cityData[2];
              $citiesHeap->insert($city);
          }
      }
  • 59. Heaps
      class ExtendedSPLHeap extends SPLHeap {
          protected $_longitude = 0;
          protected $_latitude = 0;

          // Smaller distances rise to the root, so the nearest city is extracted first
          protected function compare($a, $b) {
              if ($a->distance == $b->distance) return 0;
              return ($a->distance > $b->distance) ? -1 : 1;
          }

          public function setLongitude($longitude) {
              $this->_longitude = $longitude;
          }

          public function setLatitude($latitude) {
              $this->_latitude = $latitude;
          }

          …..

          public function insert($value) {
              $value->distance = $this->_calculateDistance($value);
              parent::insert($value);
          }
      }

      $citiesHeap = new ExtendedSPLHeap();
      // Latitude and Longitude for Brighton
      $citiesHeap->setLatitude(50.8300);
      $citiesHeap->setLongitude(-0.1556);

      $file = new SplFileObject("cities.csv");
      $file->setFlags(
          SplFileObject::DROP_NEW_LINE |
          SplFileObject::SKIP_EMPTY
      );
      while (!$file->eof()) {
          $cityData = $file->fgetcsv();
          // Skip blank or incomplete rows (e.g. the read at end of file)
          if (is_array($cityData) && count($cityData) >= 3) {
              $city = new StdClass;
              $city->name = $cityData[0];
              $city->latitude = $cityData[1];
              $city->longitude = $cityData[2];
              $citiesHeap->insert($city);
          }
      }
  • 60. Heaps – Gotchas • Compare method is reversed logic from a usort() callback • Traversing the heap removes elements from the heap
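Both gotchas can be seen in a few lines. In this sketch the class name `SpaceshipHeap` and the values are hypothetical, and the `<=>` operator assumes PHP 7 or later. The same comparison expression gives ascending order in `usort()` but descending output from the heap, and iterating a clone keeps the original heap intact.

```php
<?php
// Hypothetical subclass: compare() uses the same expression as the usort() callback below.
class SpaceshipHeap extends SplHeap
{
    protected function compare($a, $b): int
    {
        return $a <=> $b;
    }
}

$values = [5, 1, 9];

// usort() with this callback sorts ascending.
$sortedAscending = $values;
usort($sortedAscending, function ($a, $b) {
    return $a <=> $b;
});

$heap = new SpaceshipHeap();
foreach ($values as $value) {
    $heap->insert($value);
}

// Traversal extracts elements, so iterate over a clone to keep the original.
$copy = clone $heap;
$heapOrder = [];
foreach ($copy as $value) {
    $heapOrder[] = $value;          // the heap yields the "largest" element first
}

$copyCount = count($copy);          // emptied by the foreach
$originalCount = count($heap);      // untouched

echo implode(',', $sortedAscending), ' vs ', implode(',', $heapOrder), PHP_EOL;
```

Cloning costs memory proportional to the heap size, so for large heaps it may be preferable to accept the destructive traversal instead.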
  • 61. SPL – Standard PHP Library E-Book Mastering the SPL Library Joshua Thijssen Available in PDF, ePub, Mobi http://www.phparch.com/books/mastering-the-spl-library/
  • 62. SPL DataStructures: Questions?
