Pf congres20110917 data-structures

Published in: Technology, Education

Transcript

  • 1. SPL Data Structures and their Complexity. Jurriën Stutterheim, September 17, 2011
  • 2. 1. Introduction
  • 3. This presentation
      • Understand what data structures are
      • How they are represented internally
      • How “fast” each one is and why that is
  • 4. Data structures
      • Classes that offer the means to store and retrieve data, possibly in a particular order
      • Implementation is (often) optimised for certain use cases
      • array is PHP’s oldest and most frequently used data structure
      • PHP 5.3 adds support for several others
  • 5. Current SPL data structures
      • SplDoublyLinkedList
          • SplStack
          • SplQueue
      • SplHeap
          • SplMaxHeap
          • SplMinHeap
      • SplPriorityQueue
      • SplFixedArray
      • SplObjectStorage
  • 6. Why care?
      • Using the right data structure in the right place could improve performance
      • Already implemented and tested: saves work
      • Can add a type hint in a function definition
      • Adds semantics to your code
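
The type-hint point on slide 6 can be sketched as follows. This is a minimal illustration, not code from the talk: the function name `total` and the sample values are hypothetical. It relies only on SplStack and SplQueue being subclasses of SplDoublyLinkedList, as the deck states later.

```php
<?php
// Hypothetical helper: the type hint documents and enforces that callers
// pass an SPL list structure rather than an arbitrary array.
function total(SplDoublyLinkedList $numbers): int
{
    $sum = 0;
    foreach ($numbers as $number) {
        $sum += $number;
    }
    return $sum;
}

$stack = new SplStack();
$stack->push(1);
$stack->push(2);

$queue = new SplQueue();
$queue->enqueue(3);
$queue->enqueue(4);

// Both subclasses satisfy the SplDoublyLinkedList hint.
echo total($stack), PHP_EOL;
echo total($queue), PHP_EOL;
```

Passing a plain array to `total()` would raise a TypeError, which is the kind of early failure a type hint is meant to provide.
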
  • 7. Algorithmic complexity
      • We want to be able to talk about the performance of the data structure implementation
          • Running speed (time complexity)
          • Space consumption (space complexity)
      • We describe complexity in terms of input size, which is machine and programming language independent
  • 9. Example
        for ($i = 0; $i < $n; $i++) {
            for ($j = 0; $j < $n; $j++) {
                echo 'tick';
            }
        }
      • For some n, how many times is “tick” printed? I.e. what is the time complexity of this algorithm?
      • Answer: n^2 times
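
To check the n^2 answer empirically, the nested loop can be wrapped in a counter. This is a small sketch; `countTicks` is a hypothetical name, and the counter stands in for the `echo` on the slide.

```php
<?php
// Hypothetical helper: counts how often the inner loop body runs.
function countTicks(int $n): int
{
    $ticks = 0;
    for ($i = 0; $i < $n; $i++) {
        for ($j = 0; $j < $n; $j++) {
            $ticks++; // stands in for echo 'tick'
        }
    }
    return $ticks;
}

foreach ([1, 10, 100] as $n) {
    echo $n, ' -> ', countTicks($n), PHP_EOL;
}
```

Each tenfold increase in n multiplies the count by one hundred, which is the quadratic growth the slide describes.
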
  • 10. Talking about complexity
      • Pick a function to act as boundary for the algorithm’s complexity
      • Worst-case (upper bound)
          • Denoted O (big-Oh)
          • “My algorithm will not be slower than this function”
      • Best-case (lower bound)
          • Denoted Ω (big-Omega)
          • “My algorithm will at least be as slow as this function”
      • If they are the same, we write Θ (big-Theta)
      • In the example: both cases are n^2, so the algorithm is in Θ(n^2)
  • 11. Visualized
  • 13. Example 2
        for ($i = 0; $i < $n; $i++) {
            if ($myBool) {
                for ($j = 0; $j < $n; $j++) {
                    echo 'tick';
                }
            }
        }
      • What is the time complexity of this algorithm?
      • O(n^2)
      • Ω(n) (if $myBool is false)
      • No Θ!
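
The gap between the two bounds in Example 2 becomes visible when the outer iterations and the ticks are counted separately. This is a sketch under the assumption that one outer iteration is one unit of work; `countSteps` and its array keys are hypothetical names.

```php
<?php
// Hypothetical helper: reports outer-loop iterations and printed ticks.
function countSteps(int $n, bool $myBool): array
{
    $outer = 0;
    $ticks = 0;
    for ($i = 0; $i < $n; $i++) {
        $outer++; // the outer loop always runs n times
        if ($myBool) {
            for ($j = 0; $j < $n; $j++) {
                $ticks++; // stands in for echo 'tick'
            }
        }
    }
    return ['outer' => $outer, 'ticks' => $ticks];
}

print_r(countSteps(10, true));  // inner loop runs: quadratic work
print_r(countSteps(10, false)); // inner loop skipped: linear work
```
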
  • 14. We can be a bit sloppy
        for ($i = 0; $i < $n; $i++) {
            if ($myBool) {
                for ($j = 0; $j < $n; $j++) {
                    echo 'tick';
                }
            }
        }
      • We describe algorithmic behaviour as input size grows to infinity
      • Constant factors and smaller terms don’t matter too much
      • E.g. 3n^2 + 4n + 1 is in O(n^2)
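
Why the smaller terms stop mattering can be seen by dividing 3n^2 + 4n + 1 by n^2 for growing n. A minimal sketch; the function name `cost` is hypothetical and simply evaluates the polynomial from the slide.

```php
<?php
// Hypothetical helper: evaluates the example polynomial 3n^2 + 4n + 1.
function cost(int $n): int
{
    return 3 * $n * $n + 4 * $n + 1;
}

foreach ([10, 1000, 100000] as $n) {
    // The ratio approaches the constant factor 3 as n grows,
    // so the 4n + 1 part contributes less and less.
    printf("n = %d, cost / n^2 = %.5f\n", $n, cost($n) / ($n * $n));
}
```
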
  • 15. Other functions
        for ($i = 0; $i < $n; $i++) {
            for ($j = 0; $j < $n; $j++) {
                echo 'tick';
            }
        }
        for ($i = 0; $i < $n; $i++) {
            echo 'tock';
        }
      • This algorithm is still in Θ(n^2).
  • 16. Bounds
      • [Figure: Order relations. Taken from Cormen et al. 2009]
  • 17. Complexity Comparison
      • [Figure: growth curves on logarithmic axes for constant, logarithmic, linear, quadratic, exponential, factorial and superexponential functions]
      • Constant: 1, logarithmic: lg n, linear: n, quadratic: n^2, exponential: 2^n, factorial: n!, super-exponential: n^n
  • 18. In numbers
      Approximate growth for n = 50:
        Function   Value
        1          1
        lg n       5.64
        n          50
        n^2        2500
        n^3        125000
        2^n        1125899906842624
        n!         3.04 × 10^64
        n^n        8.88 × 10^84
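
The figures for n = 50 can be recomputed with a few lines of PHP. This sketch assumes a 64-bit PHP build, so that 2^50 still fits in an integer; `factorial` is a hypothetical helper that returns a float because 50! does not fit.

```php
<?php
// Hypothetical helper: n! as a float, since 50! overflows a 64-bit integer.
function factorial(int $n): float
{
    $result = 1.0;
    for ($i = 2; $i <= $n; $i++) {
        $result *= $i;
    }
    return $result;
}

$n = 50;
$rows = [
    '1'    => 1,
    'lg n' => log($n, 2),   // logarithm base 2
    'n'    => $n,
    'n^2'  => $n ** 2,
    'n^3'  => $n ** 3,
    '2^n'  => 2 ** $n,      // integer on 64-bit builds
    'n!'   => factorial($n),
    'n^n'  => pow($n, $n),  // overflows to float automatically
];

foreach ($rows as $label => $value) {
    printf("%-5s %s\n", $label, $value);
}
```
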
  • 19. Some more notes on complexity
      • Constant time is written 1, but goes for any constant c
      • Polynomial time contains all functions in n^c for some constant c
      • Everything in this presentation will be in polynomial time
  • 20. 2. SPL Data Structures
  • 21. Credit where credit is due
      • The first three pictures in this section are from Wikipedia
  • 24. SplDoublyLinkedList
      • [Figure: doubly linked list with nodes 12, 99, 37]
      • Superclass of SplStack and SplQueue
      • SplDoublyLinkedList is not truly a doubly linked list; it behaves like a hashtable
      • Usual doubly linked list time complexity
          • Append/prepend to available node in Θ(1)
          • Lookup by scanning in O(n)
          • Access to beginning/end in Θ(1)
      • SplDoublyLinkedList time complexity
          • Insert/delete by index in Θ(1)
          • Lookup by index in Θ(1)
          • Access to beginning/end in Θ(1)
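
A short usage sketch of the operations named on this slide, using the node values from the figure. It only demonstrates the API (append, prepend, access to both ends, lookup by index) and makes no claim about the internal representation.

```php
<?php
$list = new SplDoublyLinkedList();
$list->push(99);    // append to the end
$list->push(37);
$list->unshift(12); // prepend to the beginning

$first  = $list->bottom(); // peek at the beginning of the list
$last   = $list->top();    // peek at the end of the list
$middle = $list[1];        // lookup by index via ArrayAccess

foreach ($list as $index => $value) {
    echo $index, ': ', $value, PHP_EOL;
}
```
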
  • 25. SplStack
      • Subclass of SplDoublyLinkedList; adds no new operations
      • Last-in, first-out (LIFO)
      • Pop/push value from/on the top of the stack in Θ(1)
      • [Figure: stack with push and pop operations]
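
The LIFO behaviour can be illustrated with a few pushes and pops. A minimal sketch with made-up string values:

```php
<?php
$stack = new SplStack();
$stack->push('a');
$stack->push('b');
$stack->push('c');

$peeked = $stack->top(); // look at the top element without removing it
$first  = $stack->pop(); // the element pushed last leaves first
$second = $stack->pop();

echo $peeked, ' ', $first, ' ', $second, PHP_EOL;
echo count($stack), ' element(s) left', PHP_EOL;
```
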
  • 26. SplQueue
      • Subclass of SplDoublyLinkedList; adds enqueue/dequeue operations
      • First-in, first-out (FIFO)
      • Read/dequeue element from front in Θ(1)
      • Enqueue element to the end in Θ(1)
      • [Figure: queue with enqueue and dequeue operations]
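
The FIFO counterpart looks like this. Again a sketch with made-up values; reading the front without removing it uses the inherited `bottom()` method.

```php
<?php
$queue = new SplQueue();
$queue->enqueue('first');
$queue->enqueue('second');
$queue->enqueue('third');

$front  = $queue->bottom();  // read the front element without removing it
$served = $queue->dequeue(); // the element enqueued first leaves first
$remaining = iterator_to_array($queue);

echo $front, ' ', $served, PHP_EOL;
print_r($remaining);
```
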
  • 27. Short excursion: trees
      • [Figure: binary tree with nodes 100, 19, 36, 17, 3, 25, 1, 2, 7]
      • Consists of nodes (vertices) and directed edges
      • Each node always has in-degree 1
      • Except the root: always in-degree 0
      • Previous property implies there are no cycles
      • Binary tree: each node has at most two child-nodes
  • 28. SplHeap, SplMaxHeap and SplMinHeap
      • [Figure: binary tree with nodes 100, 19, 36, 17, 3, 25, 1, 2, 7]
      • A heap is a tree with the heap property: for all A and B, if B is a child node of A, then
          • val(A) ≥ val(B) for a max-heap: SplMaxHeap
          • val(A) ≤ val(B) for a min-heap: SplMinHeap
      • Where val(A) denotes the value of node A
  • 29. Heaps contd.
      • SplHeap is an abstract superclass
      • Implemented as binary tree
      • Access to root element in Θ(1)
      • Insertion/deletion in O(lg n)
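
The heap property can be observed through the root element: a min-heap always exposes its smallest value, a max-heap its largest. The sketch below inserts a few of the values from the figure in arbitrary order.

```php
<?php
$min = new SplMinHeap();
$max = new SplMaxHeap();

foreach ([19, 100, 3, 36, 7] as $value) {
    $min->insert($value);
    $max->insert($value);
}

$smallest = $min->top(); // root of the min-heap
$largest  = $max->top(); // root of the max-heap

// Repeatedly extracting the root yields the values in sorted order.
$sorted = [];
while (!$min->isEmpty()) {
    $sorted[] = $min->extract();
}

echo $smallest, ' ', $largest, PHP_EOL;
print_r($sorted);
```
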
  • 30. SplPriorityQueue
      • Variant of SplMaxHeap: for all A and B, if B is a child node of A, then prio(A) ≥ prio(B)
      • Where prio(A) denotes the priority of node A
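
In use, this means elements come out in order of descending priority regardless of insertion order. A sketch with hypothetical task names and priorities; the priorities are kept distinct so the extraction order is unambiguous.

```php
<?php
$tasks = new SplPriorityQueue();
$tasks->insert('write tests', 5); // insert(value, priority)
$tasks->insert('fix outage', 10);
$tasks->insert('refactor', 1);

// extract() removes and returns the value with the highest priority.
$order = [];
while (!$tasks->isEmpty()) {
    $order[] = $tasks->extract();
}

print_r($order);
```
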
  • 31. SplFixedArray
      • Fixed-size array with numerical indices only
      • Efficient OO array implementation
          • No hashing required for keys
          • Can make assumptions about array size
      • Lookup, insertion, deletion in Θ(1) time
      • Resize in Θ(n)
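
A brief sketch of the fixed-size behaviour: indices are integers within the declared size, writing outside that range fails, and growing the array is an explicit resize. The values are made up for illustration.

```php
<?php
$array = new SplFixedArray(3); // size is fixed up front
$array[0] = 12;
$array[1] = 99;
$array[2] = 37;

// Writing past the end does not grow the array; it throws instead.
$outOfRange = false;
try {
    $array[3] = 1;
} catch (RuntimeException $e) {
    $outOfRange = true;
}

$array->setSize(5);           // explicit resize; new slots hold null
$contents = $array->toArray();

var_dump($outOfRange);
print_r($contents);
```
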
  • 32. SplObjectStorage
      • Storage container for objects
      • Insertion, deletion in Θ(1)
      • Verification of presence in Θ(1)
      • Missing: set operations
          • Union, intersection, difference, etc.
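
Insertion, presence checks and deletion can be sketched with the array-access syntax that SplObjectStorage supports. The objects and the attached string are placeholders.

```php
<?php
$storage = new SplObjectStorage();
$a = new stdClass();
$b = new stdClass();

$storage[$a] = 'data for a'; // insert object $a with associated data

$hasA = isset($storage[$a]); // presence check
$hasB = isset($storage[$b]); // $b was never inserted

unset($storage[$a]);         // delete
$sizeAfterDelete = count($storage);

var_dump($hasA, $hasB, $sizeAfterDelete);
```
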
  • 33. 3. Concluding
  • 34. Missing in PHP
      • Set data structure
      • Map/hashtable data structure
          • Does SplDoublyLinkedList satisfy this use case?
          • If yes: split it in two separate structures and make SplDoublyLinkedList a true doubly linked list
      • Immutable data structures
          • Allow us to more easily emulate “pure” functions
          • Fewer bugs in your code due to lack of mutable state
  • 35. Closing remarks
      • Use the SPL data structures!
      • Choose them with care
      • Reason about your code’s complexity
  • 36. Questions?
