Persistent Data Structures

    Living in a world where nothing
   changes but everything evolves
                  - or -
A complete idiot's guide to immutability
Java                            Haskell


● Warm, soft and cute            ● Strange, unfamiliar alien
● Imperative                     ● Purely functional
● Object oriented                ● Everything is different
● Just like good old             ● Shocking news! It's not
  Basic, but with classes          like Basic!
Haskell does not have variables!
Imagine a dialect of Java where everything is final by default
  class LinkedList {
    class Node {
      final Node next, prev;
      final Object value;
    }

    final Node head, tail;

    void add(final Object v) {
      for (final Node n = head; n != null; n = n.next) {
      }
    }
  }

   All fields, parameters and variables are automatically
 immutable, the final is implied everywhere, and there is no
                      way to get rid of it
Haskell does not have variables!
Imagine a dialect of Java where everything is final by default
  class LinkedList {
    class Node {
      final Node next, prev;
      final Object value;
    }

    final Node head, tail;

    void add(final Object v) {
      for (final Node n = head; n != null; n = n.next) {
      }
    }
  }

   "But it doesn't make sense!"   "It does for me!"   "It won't work!"

        All fields, parameters and variables are automatically
               immutable, the final is implied everywhere
What is a variable?

vary, varied, varying

 ● — verb (used with object)
Definition: to change or alter, as in form, appearance,
character, or substance

 ● — verb (used without object)
Definition: to undergo change in appearance, form, substance,
character, etc

 ● — synonyms:
modify, mutate
"Variables" in Haskell

 ● Must be assigned when declared

   YES: int a = 1;          NO: int a;

 ● Cannot be reassigned

   YES: final int a = 1; NO: a = 2;

These are mathematical variables, not imperative ones!
When everything is immutable

There is no notion of time:

 ● Functions take old values, produce new values, nothing is
   changed in-place
 ● It does not matter when a function was called, it only
   matters what arguments it was called with

There is no notion of identity:

 ● Everything is a value, complex data structures are values
 ● There is no way to tell if a == b, only if a.equals(b)
 ● In other words, values are never identical to each other, but
   may be equal
I want my linked list!

Basic terminology:

 ● Ephemeral data structure — everything that is not
   persistent. Most Java data structures (lists, sets, etc.) are ephemeral

 ● Persistent data structure — immutable data structure with
   history. No in-place modifications. Operations on it create
   new versions. Older versions are always available. That. Is.

 ● The persistence property has nothing to do with persistent
   storage, like disks! This is a completely different story.
I want my linked list!

 ● In imperative languages, like Java, most data structures are
   ephemeral by default
Designing persistent data structures is somewhat awkward and
not always efficient

 ● In purely functional languages, like Haskell, all data
   structures are automatically persistent!
There is just no other way to make data structures
History of updates

  Making an update to a persistent data structure instance
always creates a new instance that contains this update.
         The current version is left unmodified.
Why should I bother?

      Is it fun? Hell yeah!

 But is it practical? Let's see!
The free lunch is over!
"The biggest sea change in software development
 since the OO revolution is knocking at the door,
  and its name is Concurrency." — Herb Sutter

                                      A commodity
                                       (my laptop)

The need for writing correct multi-threaded code
           is constantly increasing
Concurrent data structures are hard!

Want a concurrent ephemeral linked list?
Here are some implementation strategies:

 ● Coarse-grained synchronization
 ● Fine-grained synchronization
 ● Optimistic synchronization
 ● Lazy synchronization
All lock-based — no composition, deadlocks, etc

 ● Non-blocking synchronization in different flavors
And if you need the size of a list, you are in trouble!
Concurrent data structures are hard!

● Making mutable concurrent data structures requires inter-
  thread coordination within these structures

● Locks and atomic references all over the place

● Decades of research by academia with many attempts

● Sophisticated algorithms that are hard to reason about, test
  and prove

● Several different ways to solve the same problems, each
  with its own pros and cons
Concurrent data structures are hard!

● Making mutable concurrent data structures requires
  inter-thread coordination within these structures

● Locks and atomic references all over the place

● Decades of research by academia with many attempts

● Sophisticated algorithms that are hard to test and prove

● Several different ways to solve the same problems, each
  with its own pros and cons

                       Yes, but are persistent data
                       structures actually simpler?
Just give up mutability!

● Persistent data structures are easy to reason about in
  a concurrent environment

● The behavior does not depend on how many threads are
  trying to "modify" it at once

● Therefore persistent data structures are very easy to test
  and debug
The whole picture

 ● Persistent data structures alone are not sufficient
They are an essential part of the picture, but not the whole
answer to concurrency
 ● Inter-thread coordination is needed
Threads still need to know what each other thread is doing to
agree on a common outcome

 ● But it can be added "outside"
Which gives us complete separation of concerns
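A minimal sketch of coordination added "outside", assuming a single shared reference is all the threads need to agree on. The data structure stays immutable and lock-free, while one AtomicReference holds the current version. The names ImmutableList and SharedList are hypothetical and do not come from the slides.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable singly linked list: no locks, no mutation.
final class ImmutableList<T> {
    final T head;
    final ImmutableList<T> tail;
    final int size;

    private ImmutableList(T head, ImmutableList<T> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    static <T> ImmutableList<T> empty() {
        return new ImmutableList<>(null, null, 0);
    }

    // Returns a new version; this version is left untouched.
    ImmutableList<T> prepend(T value) {
        return new ImmutableList<>(value, this, size + 1);
    }
}

// All inter-thread coordination lives here, outside the data structure.
final class SharedList<T> {
    private final AtomicReference<ImmutableList<T>> current =
            new AtomicReference<>(ImmutableList.<T>empty());

    // updateAndGet re-applies the pure function if another thread won the race.
    void add(T value) {
        current.updateAndGet(list -> list.prepend(value));
    }

    // Every snapshot is a consistent version that will never change.
    ImmutableList<T> snapshot() {
        return current.get();
    }
}
```

Because a snapshot never changes, reading its size needs no locking at all, which is exactly the operation that is troublesome for a concurrent ephemeral list.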
The whole picture

Solving the concurrency challenge in a modern language:

 ● Scala Way — Persistent data structures with message
   passing

 ● Clojure Way — Persistent data structures with software
   transactional memory

 ● Will likely be mixed in the future
Last few words on concurrency

● Persistent data structures are slower than ephemeral ones
  in sequential use

● But not that much slower!

● We can forgive it, since they give you more functionality,
  and ephemeral data structures are simply less capable

● And in the multiprocessor era, it is better to make things
  scalable rather than fast
Efficient persistent data structures

We want persistent data structures to be space and time efficient

 ● Structural sharing
We want to reuse as many fragments of the previous version
as possible
 ● Path copying
We want to copy as few pieces as possible
 ● Maybe, just maybe lazy evaluation (where available)
We don't want nasty pathological cases
A case study

● Let's make some persistent data
  structures in Java

● All these structures consist of
  classes with only final fields

● With good amortized asymptotic
  complexity in most cases

                                   "Why are you looking at me?!"
Our plan

Let's start with some trivial examples

 ● Stack

 ● Queue

 ● Tree

Then proceed with more advanced structures

 ● Hash Table

 ● Finger Tree
Trivial Example — Persistent Stack
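It's just a singly linked list of nodes

class Stack<T> {
  final T v;            // value on top of this stack
  final Stack<T> next;  // rest of the stack, null for the empty stack
  final int size;       // number of elements (field implied by the constructor)

  Stack() {
    v = null;
    next = null;
    size = 0;
  }

  Stack(T v, Stack<T> next) {
    this.v = v;
    this.next = next;
    this.size = next.size + 1;
  }

  boolean isEmpty() { return next == null; }
}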

                               Source Code 1/2
Trivial Example — Persistent Stack
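import java.util.NoSuchElementException;

class Stack<T> {
  // fields and constructors as on the previous slide

  Stack<T> push(T v) {
    return new Stack<T>(v, this);  // the new version shares this one as its tail
  }

  T peek() {
    if (next == null)
      throw new NoSuchElementException();
    return v;
  }

  Stack<T> pop() {
    if (next == null)
      throw new NoSuchElementException();
    return next;  // the previous version is still intact
  }
}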

                                Source Code 2/2
Trivial Example — Persistent Stack

      Structural sharing in persistent stack
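A minimal sketch of structural sharing in code, assuming a hypothetical PStack class that mirrors the Stack from the previous slides. Two versions are branched from the same base, and both keep pointing at the very same base nodes.

```java
import java.util.NoSuchElementException;

// Hypothetical PStack: same shape as the Stack on the slides.
final class PStack<T> {
    final T v;
    final PStack<T> next; // null only in the empty stack

    PStack() { this.v = null; this.next = null; }
    private PStack(T v, PStack<T> next) { this.v = v; this.next = next; }

    boolean isEmpty() { return next == null; }

    PStack<T> push(T v) { return new PStack<>(v, this); }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return v;
    }

    PStack<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return next;
    }
}

final class PStackDemo {
    public static void main(String[] args) {
        PStack<String> base = new PStack<String>().push("a").push("b");
        PStack<String> left = base.push("c");   // one new version
        PStack<String> right = base.push("d");  // another version, branching from base

        System.out.println(left.peek());               // c
        System.out.println(right.peek());              // d
        System.out.println(base.peek());               // b (base is unchanged)
        System.out.println(left.pop() == right.pop()); // true (both share base)
    }
}
```

Each push allocates exactly one node, so the cost of keeping every old version around is a single object per update.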
Trivial Example — Persistent Stack

      Looks familiar?
     The versions tree!
Trivial Example — Persistent Stack

    Also known as
   Spaghetti stack or
     Cactus stack
Persistent Queue

It's just two stacks combined:

 ● Back stack to enqueue items
 ● Front stack to dequeue items

When the front stack is empty, reverse the back stack and
use it as the front stack
Persistent Queue
class Queue<T> {
  // back stack - push elements here
  final Stack<T> b;
  // front stack - pop elements from here
  final Stack<T> f;

  Queue() {
    b = f = new Stack<T>();
  }

  Queue(Stack<T> b, Stack<T> f) {
    this.b = b;
    this.f = f;
  }

  // check() keeps f non-empty whenever the queue has elements
  boolean isEmpty() {
    return f.isEmpty();
  }
}

                              Source Code 1/3
Persistent Queue
class Queue<T> {
  // restores the invariant: f is empty only if the whole queue is empty
  static <T> Queue<T> check(Stack<T> b, Stack<T> f) {
    if (f.isEmpty())
      return new Queue<T>(f, b.reverse());  // the empty f becomes the new back stack
    else
      return new Queue<T>(b, f);
  }

  Queue<T> push(T v) {
    return check(b.push(v), f);
  }

  Queue<T> pop() {
    if (isEmpty())
      throw new NoSuchElementException();
    return check(b, f.pop());
  }
}

                                 Source Code 2/3
Persistent Queue
class Queue<T> {
  T peek() {
    if (isEmpty())
      throw new NoSuchElementException();
    return f.peek();
  }
}

class Stack<T> {
  // O(n): builds a new stack with the elements in reverse order
  Stack<T> reverse() {
    if (isEmpty() || next.isEmpty())
      return this;
    Stack<T> r = new Stack<T>();
    for (Stack<T> s = this; !s.isEmpty(); s = s.pop()) {
      r = r.push(s.peek());
    }
    return r;
  }
}

                               Source Code 3/3
Persistent Queue

Structural sharing in persistent queue
Persistent Queue

Beware pathological cases!

 ● What if the front stack is empty, but the back stack is full?

 ● And we are going to pop from the same queue N times

 ● Then we get N back stack reversals!

 ● Lazy evaluation to the rescue — use lazy streams instead of
   strict stacks
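The pathological case can be made visible by counting reversals. The sketch below assumes a hypothetical CountingQueue that follows the same two-stack design; the static reversals counter is instrumentation only and not part of the technique.

```java
import java.util.NoSuchElementException;

// Hypothetical two-stack queue, instrumented to count back stack reversals.
final class CountingQueue<T> {
    // Minimal immutable stack cell; null stands for the empty stack.
    static final class Cell<T> {
        final T value;
        final Cell<T> next;
        Cell(T value, Cell<T> next) { this.value = value; this.next = next; }
    }

    static int reversals = 0; // instrumentation only

    final Cell<T> back, front;

    CountingQueue() { this(null, null); }
    private CountingQueue(Cell<T> back, Cell<T> front) {
        this.back = back;
        this.front = front;
    }

    // O(n): walks the whole stack and rebuilds it in reverse order.
    static <T> Cell<T> reverse(Cell<T> s) {
        reversals++;
        Cell<T> r = null;
        for (; s != null; s = s.next) r = new Cell<>(s.value, r);
        return r;
    }

    // Keeps the invariant: front is empty only if the whole queue is empty.
    static <T> CountingQueue<T> check(Cell<T> back, Cell<T> front) {
        if (front == null && back != null)
            return new CountingQueue<>(null, reverse(back));
        return new CountingQueue<>(back, front);
    }

    boolean isEmpty() { return front == null; }

    CountingQueue<T> push(T v) { return check(new Cell<>(v, back), front); }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return front.value;
    }

    CountingQueue<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return check(back, front.next);
    }
}

final class CountingQueueDemo {
    public static void main(String[] args) {
        CountingQueue<Integer> q = new CountingQueue<>();
        for (int i = 0; i < 100; i++) q = q.push(i); // one reversal, on the first push
        for (int k = 0; k < 10; k++) q.pop();        // same version popped 10 times
        System.out.println(CountingQueue.reversals); // 11
    }
}
```

Every pop of that one version repeats the full O(n) reversal, because the strict version cannot remember work done on behalf of a sibling version. A lazy stream would compute the reversal once and share the result.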
Persistent Queue

                               But there is a better way
                                  to design a queue!

Monoidally Annotated 2-3 Finger Tree is a versatile data
structure that can be used to build efficient lists, deques,
priority queues, interval trees, ropes, etc.

It is more complex, we will take a look at it later.
Persistent Tree

● It is trivial to convert any ephemeral tree to a persistent one
  by means of path copying

● It works for binary trees, 2-3 trees, B-trees, etc

● The shape of tree is not affected, only mutating algorithms

● In a balanced binary tree at most log N nodes need to be
  copied — quite efficient

● The secret to all persistent data structures is that they all
  are trees! (Yes, lists and hash tables are trees too)
Persistent Tree
Simple Persistent Binary Tree

class SimpleBinaryTree {
  static class Node<K, V> {
    final K key;
    final V value;
    final Node<K, V> l, r;  // left and right subtrees, null when absent

    Node(K key, V value, Node<K, V> l, Node<K, V> r) {
      this.key = key;
      this.value = value;
      this.l = l;
      this.r = r;
    }
  }
}

                           Source Code 1/2
Simple Persistent Binary Tree

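class SimpleBinaryTree {
  // returns a new root; only the nodes on the search path are copied
  static <K extends Comparable<K>, V> Node<K, V> insert(Node<K, V> n, K key, V value) {
    if (n == null)
      return new Node<>(key, value, null, null);
    int cmp = key.compareTo(n.key);
    if (cmp < 0)  // copy this node, share the right subtree
      return new Node<>(n.key, n.value, insert(n.l, key, value), n.r);
    if (cmp > 0)  // copy this node, share the left subtree
      return new Node<>(n.key, n.value, n.l, insert(n.r, key, value));
    return new Node<>(key, value, n.l, n.r);  // same key: replace the value
  }
}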

                            Source Code 2/2
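A minimal sketch of what path copying buys, assuming a hypothetical PTree class with the same insert logic plus a lookup. After an insert, the old version still answers queries as before, and the subtree that was not on the search path is the same object in both versions.

```java
// Hypothetical PTree: an unbalanced persistent binary search tree.
final class PTree<K extends Comparable<K>, V> {
    final K key;
    final V value;
    final PTree<K, V> l, r;

    PTree(K key, V value, PTree<K, V> l, PTree<K, V> r) {
        this.key = key;
        this.value = value;
        this.l = l;
        this.r = r;
    }

    // Copies only the nodes on the path from the root to the key.
    static <K extends Comparable<K>, V> PTree<K, V> insert(PTree<K, V> n, K key, V value) {
        if (n == null) return new PTree<>(key, value, null, null);
        int cmp = key.compareTo(n.key);
        if (cmp < 0) return new PTree<>(n.key, n.value, insert(n.l, key, value), n.r);
        if (cmp > 0) return new PTree<>(n.key, n.value, n.l, insert(n.r, key, value));
        return new PTree<>(key, value, n.l, n.r);
    }

    // Plain binary search; returns null if the key is absent.
    static <K extends Comparable<K>, V> V get(PTree<K, V> n, K key) {
        while (n != null) {
            int cmp = key.compareTo(n.key);
            if (cmp == 0) return n.value;
            n = cmp < 0 ? n.l : n.r;
        }
        return null;
    }
}

final class PTreeDemo {
    public static void main(String[] args) {
        PTree<Integer, String> v1 = null;
        v1 = PTree.insert(v1, 50, "fifty");
        v1 = PTree.insert(v1, 30, "thirty");
        v1 = PTree.insert(v1, 70, "seventy");

        PTree<Integer, String> v2 = PTree.insert(v1, 20, "twenty");

        System.out.println(PTree.get(v1, 20)); // null (old version is unchanged)
        System.out.println(PTree.get(v2, 20)); // twenty
        System.out.println(v1.r == v2.r);      // true (right subtree is shared)
        System.out.println(v1.l == v2.l);      // false (left path was copied)
    }
}
```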
Persistent Tree

Multiple definitions of persistence:

 ● Immutable data structure with history
 ● Committed to a persistent storage

Append only databases and file systems:

 ● CouchDB uses append only B-Tree
 ● RethinkDB makes append only variant of MySQL
 ● ZFS, BTRFS implement copy-on-write transactions
   and snapshots

Nothing is new under the moon!
Persistent Map

interface Map<K, V> {
  // get value for a key, or null if not found
  V get(K key);
  // make key/value association
  Map<K, V> put(K key, V value);
  // remove key/value association
  Map<K, V> remove(K key);
}

             Remember, no in-place updates
             Mutations create new instances
Persistent Map

Implementation Strategy

 ● Persistent red-black tree for ordered keys
   Time complexity — O(log n)

 ● Persistent hash table for hashable keys
   Time complexity — O(1)
   (effectively constant: a 32-bit hash code yields at most 7 levels of 5-bit symbols)
Persistent Hash Table

But how do we implement it?
Copying the whole table would be too expensive!
Persistent Hash Table

Here's the idea: partition the hash table into smaller
pieces, organize them as a persistent tree

Nice idea, but how do we navigate in such a tree?
Prefix Tree/Trie
Search is guided by individual letters of a string key

Hash code is just a string of digits!
Persistent Hash Table in Prefix Tree

Represent 32-bit hash codes as strings of 5-bit symbols:

hashCode = 0xCAFEBABE

level      6     5     4     3     2     1     0
bits      11 00101 01111 11101 01110 10101 11110
symbol     3     5    15    29    14    21    30
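The decomposition above can be reproduced with a shift and a mask. A minimal sketch, where the class name HashSymbols and the helper symbol are hypothetical:

```java
final class HashSymbols {
    // Returns the 5-bit symbol of hashCode at the given level (level 0 = lowest bits).
    static int symbol(int hashCode, int level) {
        return (hashCode >>> (level * 5)) & 31;
    }

    public static void main(String[] args) {
        int hashCode = 0xCAFEBABE;
        StringBuilder sb = new StringBuilder();
        for (int level = 6; level >= 0; level--) {
            sb.append(symbol(hashCode, level));
            if (level > 0) sb.append(' ');
        }
        System.out.println(sb); // 3 5 15 29 14 21 30
    }
}
```

The unsigned shift >>> matters here: 0xCAFEBABE is a negative int, and a signed shift would smear the sign bit into the top symbol.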
Persistent Hash Table

     hashCode = ... xxxxx xxxxx xxxxx xxxxx

Each item is either a key/value pair or a subtree
Persistent Hash Table

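class PersistentHashMap {
  abstract static class Item<K, V> {}

  // inner node: one slot per 5-bit symbol
  static class Node<K, V> extends Item<K, V> {
    @SuppressWarnings("unchecked")
    final Item<K, V>[] children = (Item<K, V>[]) new Item[32];
  }

  // key/value pair; next presumably chains entries with colliding hash codes
  static class Entry<K, V> extends Item<K, V> {
    final int hashCode;
    final K key;
    final V value;
    final Entry<K, V> next;
  }
}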

                        Source Code 1/2
Persistent Hash Table

class PersistentHashMap<K, V> {
  final Node<K, V> root;  // root of the trie (declaration not shown on the slide)

  V get(K key) {
    return root.find(key.hashCode(), key, 0);
  }

  static class Node<K, V> extends Item<K, V> {
    V find(int hashCode, K key, int level) {
      int index = (hashCode >>> (level * 5)) & 31;  // 5-bit symbol for this level
      Item<K, V> item = children[index];
      if (item instanceof Node)   // subtree: descend one level
        return ((Node<K, V>) item).find(hashCode, key, level + 1);
      if (item instanceof Entry)  // key/value pair: compare hash code and key
        return ((Entry<K, V>) item).find(hashCode, key);
      return null;                // empty slot: the key is absent
    }
  }
}

                          Source Code 2/2
Persistent Hash Table

Do not waste space!

      class PersistentHashMap {
        class Node<K, V> {
          final Item<K, V>[] children = (Item<K, V>[]) new Item[32];
        }
      }

 ● Most of the children would be null on deeper levels

 ● The number of arrays grows exponentially as we go deeper

 ● Need to find a way to compact tree

 ● Simply get rid of nulls in arrays!
Persistent Hash Table

    class Node<K, V> {
      final int mask;
      final Item<K, V>[] children =
        (Item<K, V>[]) new Item[Integer.bitCount(mask)];
    }

● Mask is a 32-bit integer whose bits are set to 1 only for those
  array elements that are not null

● Array stores only non-null elements. Its size is the number
  of 1 bits in the mask. Array size varies from 2 to 32

● Overhead for null array element is just one bit. Quite good!
Persistent Hash Table

● To test that array has element at index i, simply test if ith bit
  in the mask is 1:

  if ((mask & (1 << i)) != 0) { ...

● To get offset to ith element in the array, count number of 1
  bits lower than i in the mask:

  int offset = bitCount(mask & ((1 << i) - 1));
  if (children[offset] instanceof ...
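The two bit tricks above are enough to build a complete compressed array. A minimal sketch, assuming a hypothetical CompactNode class that stores plain values instead of Item objects; updates copy the small array and return a new node.

```java
// Hypothetical CompactNode: a 32-slot sparse array packed with a bitmap.
final class CompactNode<T> {
    final int mask;       // bit i is 1 if logical slot i is occupied
    final Object[] items; // only the occupied slots, in slot order

    CompactNode() { this(0, new Object[0]); }
    private CompactNode(int mask, Object[] items) {
        this.mask = mask;
        this.items = items;
    }

    // Number of occupied slots below slot i, which is the position of slot i in items.
    private int offset(int i) {
        return Integer.bitCount(mask & ((1 << i) - 1));
    }

    @SuppressWarnings("unchecked")
    T get(int i) {
        if ((mask & (1 << i)) == 0) return null; // slot is empty
        return (T) items[offset(i)];
    }

    // Returns a new node with slot i set; this node is not modified.
    CompactNode<T> with(int i, T value) {
        int bit = 1 << i;
        int pos = offset(i);
        boolean present = (mask & bit) != 0;
        Object[] copy = new Object[present ? items.length : items.length + 1];
        System.arraycopy(items, 0, copy, 0, pos);
        copy[pos] = value;
        int from = present ? pos + 1 : pos; // skip the old value when overwriting
        System.arraycopy(items, from, copy, pos + 1, items.length - from);
        return new CompactNode<>(mask | bit, copy);
    }
}
```

For example, new CompactNode<String>().with(5, "five").with(31, "last") holds an array of length 2, and get(7) on it returns null without touching the array.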
Persistent List

interface Seq<T> {
  T head();                   // get first element
  Seq<T> tail();              // get list without first element
  Seq<T> cons(T v);           // prepend element to head
  Seq<T> snoc(T v);           // append element to tail
  Seq<T> concat(Seq<T> that); // join two lists
  int size();                 // get number of elements
  T get(int index);           // get Nth element
  Seq<T> set(int index, T v); // set Nth element
}

             Remember, no in-place updates
             Mutations create new instances
Persistent List

● There are quite a few ways to implement persistent lists

● But we will not be studying them

● Instead, we will turn our attention to finger trees

● Soon, it will be clear why
Finger Trees

● An incredibly elegant, simple and efficient data structure

● Oh so very versatile, functional programmer's Swiss Army knife

● Basic data structure for building random access sequences,
  deques, priority queues, ropes, interval trees, etc.

● Let's define it in stages
Persistent leafy 2-3 trees

Let's begin with a simple data structure — leafy 2-3 tree

 ● Every intermediate node has either two or three children

 ● All values are stored in leaves

 ● Perfectly balanced: all leaves are at the same level
Persistent leafy 2-3 trees
Persistent leafy 2-3 trees

         Leaves contain interesting values,
           but what is stored in nodes?
Annotated leafy 2-3 trees

● There must be a way to find interesting values in a tree

● We need to guide search from the root of a tree to its leaves

● Let's add special annotations to nodes

● Use these annotations to find values
Size annotated leafy 2-3 trees

● Each intermediate node is annotated with the size of a
  subtree rooted at this node

● Makes it trivial to find any leaf by its index

● Starting from root, test if index is in the range of its left
  (middle) or right subtree, and repeat recursively for that
  subtree, until a leaf is found
Size annotated leafy 2-3 trees

     Looks like random access list
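A minimal sketch of index lookup guided by size annotations, assuming a hypothetical SizedTree class whose inner nodes have two or three children. It only shows how a finished tree is searched; it does not implement balanced insertion.

```java
// Hypothetical size-annotated leafy tree; inner nodes have two or three children.
abstract class SizedTree<T> {
    abstract int size();
    abstract T get(int index);

    static <T> SizedTree<T> leaf(T value) { return new Leaf<>(value); }

    @SafeVarargs
    static <T> SizedTree<T> node(SizedTree<T>... children) { return new Node<>(children); }

    static final class Leaf<T> extends SizedTree<T> {
        final T value;
        Leaf(T value) { this.value = value; }
        int size() { return 1; }
        T get(int index) {
            if (index != 0) throw new IndexOutOfBoundsException();
            return value;
        }
    }

    static final class Node<T> extends SizedTree<T> {
        final SizedTree<T>[] children;
        final int size; // annotation: number of leaves below this node

        Node(SizedTree<T>[] children) {
            if (children.length < 2 || children.length > 3)
                throw new IllegalArgumentException("a 2-3 node needs 2 or 3 children");
            this.children = children.clone();
            int total = 0;
            for (SizedTree<T> c : children) total += c.size();
            this.size = total;
        }

        int size() { return size; }

        // Skips whole subtrees by their size until the index falls inside one.
        T get(int index) {
            for (SizedTree<T> c : children) {
                if (index < c.size()) return c.get(index);
                index -= c.size();
            }
            throw new IndexOutOfBoundsException();
        }
    }
}

final class SizedTreeDemo {
    public static void main(String[] args) {
        SizedTree<String> t = SizedTree.node(
                SizedTree.node(SizedTree.leaf("a"), SizedTree.leaf("b")),
                SizedTree.node(SizedTree.leaf("c"), SizedTree.leaf("d"), SizedTree.leaf("e")));
        System.out.println(t.size()); // 5
        System.out.println(t.get(3)); // d
    }
}
```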
Priority annotated leafy 2-3 trees

● Each intermediate node is annotated with the highest
  priority of an element in its subtree

● Makes it trivial to find value with the highest priority

● Starting from root, find the subtree with the highest priority
  and descend recursively into it, until a leaf is found
Priority annotated leafy 2-3 trees

         Looks like a priority queue
Monoids

● One interface to unify size, priority (and more!) annotations
  on trees

● A set of values with a "zero" element 0 and a binary
  associative operation ⊕

● Monoid laws:
  0⊕a = a
  a⊕0 = a
  a⊕(b⊕c) = (a⊕b)⊕c
Monoid examples

● Strings with empty string and concatenation
  "" + "a" = "a", "a" + "" = "a"
  "a" + ("b" + "c") = ("a" + "b") + "c"

● Integers with zero and addition
  0 + 1 = 1, 1 + 0 = 1
  1 + (2 + 3) = (1 + 2) + 3

● Integers with one and multiplication
  1 * 2 = 2, 2 * 1 = 2
  2 * (3 * 4) = (2 * 3) * 4

● And many more of them! (Monoids are everywhere)
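A minimal sketch of why the laws are useful: any monoid, given as a zero element and an associative operation, can be folded over a list with the same code. The class MonoidFold and its fold method are hypothetical names; BinaryOperator is the standard java.util.function interface.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BinaryOperator;

final class MonoidFold {
    // Folds a list using a monoid given as (zero, op).
    static <T> T fold(List<T> values, T zero, BinaryOperator<T> op) {
        T acc = zero;
        for (T v : values) acc = op.apply(acc, v);
        return acc;
    }

    public static void main(String[] args) {
        // Strings with empty string and concatenation
        System.out.println(fold(Arrays.asList("a", "b", "c"), "", String::concat)); // abc
        // Integers with zero and addition
        System.out.println(fold(Arrays.asList(1, 2, 3, 4), 0, Integer::sum));       // 10
        // Integers with one and multiplication
        System.out.println(fold(Arrays.asList(1, 2, 3, 4), 1, (a, b) -> a * b));    // 24
        // The zero element is the natural answer for an empty list
        System.out.println(fold(Collections.<Integer>emptyList(), 0, Integer::sum)); // 0
    }
}
```

Associativity is what lets a tree combine the annotations of its subtrees in any grouping and still get the same result as a left-to-right fold over the leaves.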
Monoid interface

interface Monoid<T extends Monoid<T>> {
  T unit();
  T combine(T that);
}

// illustration only: java.lang.String is final and does not implement Monoid
class String implements Monoid<String> {

  public String unit() {
    return "";
  }

  public String combine(String that) {
    return this + that;
  }
}
Size monoid

class Size implements Monoid<Size> {
  final int size;

  Size(int size) {
    this.size = size;
  }

  public Size unit() {
    return new Size(0);  // an empty subtree has no elements
  }

  public Size combine(Size that) {
    return new Size(this.size + that.size);
  }
}
Priority monoid

class Priority implements Monoid<Priority> {
  final int priority;  // smaller number means higher priority

  Priority(int priority) {
    this.priority = priority;
  }

  public Priority unit() {
    return new Priority(Integer.MAX_VALUE);  // lowest possible priority
  }

  public Priority combine(Priority that) {
    return new Priority(
      Math.min(this.priority, that.priority));
  }
}
But where do we get monoids from?

● Monoids have nice property of composability

● We can get more monoids by combining existing ones

● But where do we get initial monoids to begin with?

● We need a way to measure values!

● Those measures must be monoids, obviously
    interface Measured<M extends Monoid<M>> {
      M measure();
    }
Let's make a sketch of annotated tree
/** <V> is the type of values,
    <M> is the type of monoidal measures of values */
abstract class Tree<M extends Monoid<M>, V extends Measured<M>>
    implements Measured<M> {

  abstract class Leaf<M, V> extends Tree<M, V> {
    final V value;
    @Override abstract M measure();
  }

  class Node<M, V> extends Tree<M, V> {
    final Tree<M, V> left, right;
    final M m;  // cached measure of the whole subtree
    Node(Tree<M, V> l, Tree<M, V> r) {
      left = l; right = r;
      m = l.measure().combine(r.measure());
    }
    @Override final M measure() {
      return m;
    }
  }
}
                                                       Pseudocode!
Let's make a sketch of annotated tree
// size annotation: every leaf counts as one element
class Leaf<V> extends Tree<Size, V> {
  final V value;

  @Override final Size measure() {
    return new Size(1);
  }
}

// priority annotation: a leaf reports the priority of its value
class Leaf<V> extends Tree<Priority, V> {
  final V value;

  @Override final Priority measure() {
    return new Priority(value.priority());
  }
}
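The pseudocode above can be turned into compilable Java with a few adjustments. The sketch below is one possible rendering, not the author's implementation: it uses a binary tree instead of a 2-3 tree, and the names ATree, Task and PriorityLookup are hypothetical. It instantiates the generic tree with the Priority monoid and finds the highest-priority value by following the annotations.

```java
interface Monoid<T extends Monoid<T>> {
    T unit();
    T combine(T that);
}

interface Measured<M extends Monoid<M>> {
    M measure();
}

final class Priority implements Monoid<Priority> {
    final int priority; // smaller number = higher priority
    Priority(int priority) { this.priority = priority; }
    public Priority unit() { return new Priority(Integer.MAX_VALUE); }
    public Priority combine(Priority that) {
        return new Priority(Math.min(priority, that.priority));
    }
}

// Hypothetical value type, measured by its priority.
final class Task implements Measured<Priority> {
    final String name;
    final int priority;
    Task(String name, int priority) { this.name = name; this.priority = priority; }
    public Priority measure() { return new Priority(priority); }
}

// Annotated leafy binary tree: every node caches the combined measure of its leaves.
abstract class ATree<M extends Monoid<M>, V extends Measured<M>> implements Measured<M> {
    static final class Leaf<M extends Monoid<M>, V extends Measured<M>> extends ATree<M, V> {
        final V value;
        Leaf(V value) { this.value = value; }
        public M measure() { return value.measure(); }
    }

    static final class Node<M extends Monoid<M>, V extends Measured<M>> extends ATree<M, V> {
        final ATree<M, V> left, right;
        final M m;
        Node(ATree<M, V> left, ATree<M, V> right) {
            this.left = left;
            this.right = right;
            this.m = left.measure().combine(right.measure());
        }
        public M measure() { return m; }
    }
}

final class PriorityLookup {
    // Descends into the child whose annotation matches the annotation of the parent.
    static Task best(ATree<Priority, Task> t) {
        while (t instanceof ATree.Node) {
            ATree.Node<Priority, Task> n = (ATree.Node<Priority, Task>) t;
            t = n.left.measure().priority == n.m.priority ? n.left : n.right;
        }
        return ((ATree.Leaf<Priority, Task>) t).value;
    }
}
```

Swapping Priority for a Size monoid would turn the same tree into a random access list, with only the leaf measure and the search predicate changing.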
But that is not finger tree yet!
Finger Tree

... is just an annotated tree of annotated 2-3 trees!
Finger Tree

Digits, 2-3 trees, fingers and nested levels
Finger Tree

A little bit of Haskell would not hurt:

data Node v a = Node2 v a a | Node3 v a a a

data Digit v a = One v a
               | Two v a a
               | Three v a a a
               | Four v a a a a

data FingerTree v a = Empty
                    | Single a
                    | Deep v
                        (Digit v a)                -- prefix
                        (FingerTree v (Node v a))  -- middle
                        (Digit v a)                -- suffix
Finger Tree

abstract class FingerTree<M extends Monoid<M>, T extends Measured<M>>
    implements Measured<M> {

  static class Empty<M extends Monoid<M>, T extends Measured<M>>
      extends FingerTree<M, T> {}

  static class Single<M extends Monoid<M>, T extends Measured<M>>
      extends FingerTree<M, T> {
    final T v;  // the only element
    final M m;  // cached measure
  }

  static class Deep<M extends Monoid<M>, T extends Measured<M>>
      extends FingerTree<M, T> {
    final Digit<M, T> prefix;
    final FingerTree<M, Node<M, T>> middle;
    final Digit<M, T> suffix;
    final M m;  // cached measure of the whole tree
  }
}

                                  Source Code 1/3
Finger Tree

abstract class Digit<M extends Monoid<M>, T extends Measured<M>>
    implements Measured<M> {
  final M m;  // cached measure of the one to four elements

  static class One<M extends Monoid<M>, T extends Measured<M>>
      extends Digit<M, T> {
    final T a;
  }

  static class Two<M extends Monoid<M>, T extends Measured<M>>
      extends Digit<M, T> {
    final T a, b;
  }

  static class Three<M extends Monoid<M>, T extends Measured<M>>
      extends Digit<M, T> {
    final T a, b, c;
  }

  static class Four<M extends Monoid<M>, T extends Measured<M>>
      extends Digit<M, T> {
    final T a, b, c, d;
  }
}

                                  Source Code 2/3
Finger Tree

abstract class Node<M extends Monoid<M>, T extends Measured<M>>
    implements Measured<M> {
  final M m;  // cached measure of the two or three children

  static class Node2<M extends Monoid<M>, T extends Measured<M>>
      extends Node<M, T> {
    final T a, b;
  }

  static class Node3<M extends Monoid<M>, T extends Measured<M>>
      extends Node<M, T> {
    final T a, b, c;
  }
}

                                 Source Code 3/3
Finger Tree Interface

Basic operations:

 ● cons, snoc: prepend/append an element
 ● concat: join two trees
 ● split: find prefix, element and suffix using a predicate

Beyond the scope of this presentation, sorry
Finger Tree Performance

Amortized bounds:

              Finger Tree          2-3 Tree   List
 ● cons, snoc O(1)                 O(log n)   O(1)/O(n)
 ● head, last O(1)                 O(log n)   O(1)/O(n)
 ● concat     O(log min(ℓ1, ℓ2))   O(log n)   O(n)
 ● split      O(log min(n, ℓ-n))   O(log n)   O(n)
 ● index      O(log min(n, ℓ-n))   O(log n)   O(n)


Spark intro by Adform ResearchSpark intro by Adform Research
Spark intro by Adform Research
SBT by Aform Research, Saulius Valatka
SBT by Aform Research, Saulius ValatkaSBT by Aform Research, Saulius Valatka
SBT by Aform Research, Saulius Valatka
Scala laboratory: Globus. iteration #2
Scala laboratory: Globus. iteration #2Scala laboratory: Globus. iteration #2
Scala laboratory: Globus. iteration #2
Testing in Scala. Adform Research
Testing in Scala. Adform ResearchTesting in Scala. Adform Research
Testing in Scala. Adform Research
Scala laboratory. Globus. iteration #1
Scala laboratory. Globus. iteration #1Scala laboratory. Globus. iteration #1
Scala laboratory. Globus. iteration #1
Cassandra + Spark + Elk
Cassandra + Spark + ElkCassandra + Spark + Elk
Cassandra + Spark + Elk
Опыт использования Spark, Основано на реальных событиях
Опыт использования Spark, Основано на реальных событияхОпыт использования Spark, Основано на реальных событиях
Опыт использования Spark, Основано на реальных событиях
ETL со Spark
ETL со SparkETL со Spark
ETL со Spark

Recently uploaded

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK

Recently uploaded (20)

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames

Immutable Data Structures Simply Explained with Java Examples

  • 1. Persistent Data Structures Living in a world where nothing changes but everything evolves - or - A complete idiot's guide to immutability
  • 2. Java vs Haskell. Java: ● Warm, soft and cute ● Imperative ● Object oriented ● Just like good old Basic, but with classes. Haskell: ● Strange, unfamiliar alien ● Purely functional ● Everything is different ● Shocking news! It's not like Basic!
  • 3. Haskell does not have variables! Imagine a dialect of Java where everything is final by default class LinkedList { class Node { final Node next, prev; final Object value; } final Node head, tail; void add(final Object v) { for (final Node n = head; n != null; n = { ... } } } All fields, parameters and variables are automatically immutable, the final is implied everywhere, and there is no way to get rid of it
  • 4. Haskell does not have variables! Imagine a dialect of Java where everything is final by default class LinkedList { class Node { final Node next, prev; final Object value; } final Node head, tail; void add(final Object v) { for (final Node n = head; n != null; n = { ... } } } All fields, parameters and variables are automatically immutable, the final is implied everywhere (Speech bubbles: "But it doesn't make sense!" "It does for me!" "It won't work!")
  • 5. What is a variable? var·y/ˈve(ə)rē/ vary, varied, varying ● — verb (used with object) Definition: to change or alter, as in form, appearance, character, or substance ● — verb (used without object) Definition: to undergo change in appearance, form, substance, character, etc ● — synonyms: modify, mutate
  • 6. "Variables" in Haskell ● Must be assigned when declared YES: int a = 1; NO: int a; ● Cannot be reassigned YES: final int a = 1; NO: a = 2; These are mathematical variables, not imperative ones!
  • 7. When everything is immutable There is no notion of time: ● Functions take old values, produce new values, nothing is changed in-place ● It does not matter when a function was called, it only matters what arguments it was called with There is no notion of identity: ● Everything is a value, complex data structures are values too ● There is no way to tell if a == b, only if a.equals(b) ● In other words, values are never identical to each other, but may be equal
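The "values, not identities" point above can be sketched in plain Java. `Point` and `withX` are hypothetical names chosen for illustration; they do not appear in the slides. The class has only final fields, an "update" returns a new value, and comparison goes through `equals` rather than `==`.

```java
import java.util.Objects;

// Hypothetical value class: every field is final and there are no setters.
final class Point {
    final int x, y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // An "update" builds a new value; the receiver is left untouched.
    Point withX(int newX) {
        return new Point(newX, y);
    }

    // Two points are equal when their contents are equal, whatever their identity.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

Because nothing can change after construction, it does not matter when `withX` is called or by which thread: the result depends only on the arguments.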
  • 8. I want my linked list! Basic terminology: ● Ephemeral data structure — everything that is not persistent. Most Java data structures (lists, sets, etc.) are ephemeral. ● Persistent data structure — immutable data structure with history. No in-place modifications. Operations on it create new versions. Older versions are always available. That. Is. Simple. ● The persistence property has nothing to do with persistent storage, like disks! This is a completely different story.
  • 9. I want my linked list! ● In imperative languages, like Java, most data structures are ephemeral by default Designing persistent data structures is somewhat awkward and not always efficient ● In purely functional languages, like Haskell, all data structures are automatically persistent! There is just no other way to make data structures
  • 10. History of updates. Making an update to an instance of a persistent data structure always creates a new instance that contains the update. The current version is left unmodified.
  • 11. Why should I bother? Is it fun? Hell yeah! But is it practical? Let's see!
  • 12. The free lunch is over! "The biggest sea change in software development since the OO revolution is knocking at the door, and its name is Concurrency." — Herb Sutter A commodity hardware (my laptop) The need for writing correct multi-threaded code is constantly increasing
  • 13. Concurrent data structures are hard! Want a concurrent ephemeral linked list? Here are some implementation strategies: ● Coarse-grained synchronization ● Fine-grained synchronization ● Optimistic synchronization ● Lazy synchronization All lock-based: no composition, deadlocks, etc ● Non-blocking synchronization in different flavors And if you need the size of a list, you are in trouble!
  • 14. Concurrent data structures are hard! ● Making mutable concurrent data structures requires inter-thread coordination within these structures ● Locks and atomic references all over the place ● Decades of research by academia with many attempts ● Sophisticated algorithms that are hard to reason about, test and prove ● Several different ways to solve the same problems, each with its own pros and cons
  • 15. Concurrent data structures are hard! ● Making mutable concurrent data structures requires inter-thread coordination within these structures ● Locks and atomic references all over the place ● Decades of research by academia with many attempts ● Sophisticated algorithms that are hard to test and prove ● Several different ways to solve the same problems, each with its own pros and cons (Speech bubble: "Yes, but are persistent data structures actually simpler?")
  • 16. Just give up mutability! ● Persistent data structures are easy to reason about in a concurrent environment ● Their behavior does not depend on how many threads are trying to "modify" them at once ● Therefore persistent data structures are very easy to test and debug
  • 17. The whole picture ● Persistent data structures alone are not sufficient They are an essential part of the picture, but not the whole answer to concurrency ● Inter-thread coordination is needed Threads still need to know what each other thread is doing to agree on a common outcome ● But it can be added "outside" Which gives us complete separation of concerns
  • 18. The whole picture. Solving the concurrency challenge in a modern language: ● Scala Way: persistent data structures with message passing ● Clojure Way: persistent data structures with software transactional memory ● Will likely be mixed in the future
  • 19. Last few words on concurrency ● Persistent data structures are slower than ephemeral ones in sequential use ● But not that much slower! ● We can forgive it, since they give you more functionality, and ephemeral data structures are simply less capable ● And in the multiprocessor era, it is better to make things scalable rather than fast
  • 20. Efficient persistent data structures We want persistent data structures to be space and time efficient: ● Structural sharing We want to reuse as many fragments of the previous version as possible ● Path copying We want to copy as few pieces as possible ● Maybe, just maybe lazy evaluation (where available) We don't want nasty pathological cases
  • 21. A case study ● Let's make some persistent data structures in Java ● All these structures consist of classes with only final fields ● With good amortized asymptotic complexity in most cases (Speech bubble: "Why are you looking at me?!")
  • 22. Our plan. Let's start with some trivial examples ● Stack ● Queue ● Tree Then proceed with more advanced structures ● Hash Table ● Finger Tree
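  • 23. Trivial Example: Persistent Stack. It's just a singly linked list of nodes.
      import java.util.NoSuchElementException;

      class Stack<T> {
        final T v;            // element on top; null in the empty stack
        final Stack<T> next;  // rest of the stack; null in the empty stack

        Stack() {
          v = null;
          next = null;
        }

        Stack(T v, Stack<T> next) {
          this.v = v;
          this.next = next;
        }
        ...
      Source Code 1/2
  • 24. Trivial Example: Persistent Stack
      class Stack<T> {
        ...
        // the empty stack is the only node without a successor
        boolean isEmpty() {
          return next == null;
        }

        // pushing allocates one node that points at the unchanged old stack
        Stack<T> push(T v) {
          return new Stack<T>(v, this);
        }

        T peek() {
          if (next == null) throw new NoSuchElementException();
          return v;
        }

        // popping just returns the previous version; nothing is copied
        Stack<T> pop() {
          if (next == null) throw new NoSuchElementException();
          return next;
        }
      }
      Source Code 2/2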
  • 25. Trivial Example — Persistent Stack Structural sharing in persistent stack
  • 26. Trivial Example — Persistent Stack Looks familiar? The versions tree!
  • 27. Trivial Example — Persistent Stack Also known as Spaghetti stack or Cactus stack
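To make the structural sharing and the "versions tree" concrete, here is a minimal, self-contained sketch of the stack from the slides. The class name `PersistentStack` and the field name `value` are assumptions made for this example. Two stacks pushed from the same base share that base node for node.

```java
import java.util.NoSuchElementException;

// Sketch of the persistent stack from the slides (names are illustrative).
final class PersistentStack<T> {
    final T value;                  // null in the empty sentinel
    final PersistentStack<T> next;  // null in the empty sentinel

    PersistentStack() {
        this(null, null);
    }

    private PersistentStack(T value, PersistentStack<T> next) {
        this.value = value;
        this.next = next;
    }

    boolean isEmpty() {
        return next == null;
    }

    // O(1): one new node whose tail is the existing, unmodified stack.
    PersistentStack<T> push(T v) {
        return new PersistentStack<T>(v, this);
    }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return value;
    }

    // O(1): the older version already exists, so it is simply returned.
    PersistentStack<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return next;
    }
}
```

If `s1` is some stack, then `s1.push("b")` and `s1.push("c")` are two independent versions whose `pop()` returns the very same `s1` object. That branching is what gives the structure its cactus shape.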
  • 28. Persistent Queue. It's just two stacks combined: ● Back stack to enqueue items ● Front stack to dequeue items. When the front stack is empty, reverse the back stack and use it as the front stack.
  • 29. Persistent Queue
      class Queue<T> {
        // back stack - push elements here
        final Stack<T> b;
        // front stack - pop elements from here
        final Stack<T> f;

        Queue() {
          b = f = new Stack<T>();
        }

        Queue(Stack<T> b, Stack<T> f) {
          this.b = b;
          this.f = f;
        }

        boolean isEmpty() {
          return f.isEmpty();
        }
        ...
      Source Code 1/3
  • 30. Persistent Queue
      class Queue<T> {
        ...
        static <T> Queue<T> check(Stack<T> b, Stack<T> f) {
          if (f.isEmpty())
            // front ran out: the (empty) front becomes the new back,
            // and the reversed back becomes the new front
            return new Queue<T>(f, b.reverse());
          else
            return new Queue<T>(b, f);
        }

        Queue<T> push(T v) {
          return check(b.push(v), f);
        }

        Queue<T> pop() {
          if (isEmpty()) {
            throw new NoSuchElementException();
          }
          return check(b, f.pop());
        }
        ...
      Source Code 2/3
  • 31. Persistent Queue
      class Queue<T> {
        ...
        T peek() {
          if (isEmpty()) {
            throw new NoSuchElementException();
          }
          return f.peek();
        }
      }

      class Stack<T> {
        ...
        Stack<T> reverse() {
          // stacks of zero or one element are their own reverse
          if (isEmpty() || next.isEmpty()) return this;
          Stack<T> r = new Stack<T>();
          for (Stack<T> s = this; !s.isEmpty(); s = s.pop()) {
            r = r.push(s.peek());
          }
          return r;
        }
      }
      Source Code 3/3
  • 32. Persistent Queue Structural sharing in persistent queue
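The two-stack queue can be put together as one self-contained sketch. The names `Stk`, `PersistentQueue`, `enqueue` and `dequeue` are assumptions made here (the slides use `Stack`, `Queue`, `push` and `pop`). The logic follows the slides: the front is refilled from the reversed back whenever it runs empty.

```java
import java.util.NoSuchElementException;

// Minimal immutable stack used by the queue (hypothetical helper name).
final class Stk<T> {
    final T value;
    final Stk<T> next; // null marks the empty stack

    Stk() { this(null, null); }
    Stk(T value, Stk<T> next) { this.value = value; this.next = next; }

    boolean isEmpty() { return next == null; }
    Stk<T> push(T v) { return new Stk<T>(v, this); }

    // O(n): builds a new stack holding the same elements in opposite order.
    Stk<T> reverse() {
        Stk<T> r = new Stk<T>();
        for (Stk<T> s = this; !s.isEmpty(); s = s.next) r = r.push(s.value);
        return r;
    }
}

final class PersistentQueue<T> {
    final Stk<T> back;  // elements are enqueued here
    final Stk<T> front; // elements are dequeued from here

    PersistentQueue() { this(new Stk<T>(), new Stk<T>()); }

    private PersistentQueue(Stk<T> back, Stk<T> front) {
        this.back = back;
        this.front = front;
    }

    // Invariant: the front is empty only when the whole queue is empty.
    private static <T> PersistentQueue<T> check(Stk<T> back, Stk<T> front) {
        return front.isEmpty()
            ? new PersistentQueue<T>(new Stk<T>(), back.reverse())
            : new PersistentQueue<T>(back, front);
    }

    boolean isEmpty() { return front.isEmpty(); }

    PersistentQueue<T> enqueue(T v) { return check(back.push(v), front); }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return front.value;
    }

    PersistentQueue<T> dequeue() {
        if (isEmpty()) throw new NoSuchElementException();
        return check(back, front.next);
    }
}
```

Every operation returns a new `PersistentQueue`, so an older version such as the queue before a `dequeue()` can still be read afterwards.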
  • 33. Persistent Queue. Beware pathological cases! ● What if the front stack is empty, but the back stack is full? ● And we are going to pop from the same queue N times ● Then we get N back stack reversals! ● Lazy evaluation to the rescue: use lazy streams instead of strict stacks
  • 34. Persistent Queue But there is a better way to design queue! Monoidally Annotated 2-3 Finger Tree is a versatile data structure that can be used to build efficient lists, deques, priority queues, interval trees, ropes, etc. It is more complex, we will take a look at it later.
  • 35. Persistent Tree ● It is trivial to convert any ephemeral tree to a persistent one by means of path copying ● It works for binary trees, 2-3 trees, B-trees, etc ● The shape of tree is not affected, only mutating algorithms ● In a balanced binary tree at most log N nodes need to be copied — quite efficient ● The secret to all persistent data structures is that they all are trees! (Yes, lists and hash tables are trees too)
  • 37. Simple Persistent Binary Tree
      class SimpleBinaryTree<K extends Comparable<K>, V> {
        class Node {
          final K key;
          final V value;
          final Node l, r;   // left and right subtrees; null when absent

          Node(K key, V value, Node l, Node r) {
            this.key = key;
            this.value = value;
            this.l = l;
            this.r = r;
          }
        }
        ...
      Source Code 1/2
  • 38. Simple Persistent Binary Tree
      class SimpleBinaryTree<K extends Comparable<K>, V> {
        ...
        Node insert(Node n, K key, V value) {
          if (n == null) {
            return new Node(key, value, null, null);
          }
          int cmp = key.compareTo(n.key);
          if (cmp < 0) {
            // copy this node, rebuild only the left subtree, share the right one
            return new Node(n.key, n.value, insert(n.l, key, value), n.r);
          }
          if (cmp > 0) {
            // copy this node, share the left subtree, rebuild only the right one
            return new Node(n.key, n.value, n.l, insert(n.r, key, value));
          }
          // same key: replace the value, share both subtrees
          return new Node(key, value, n.l, n.r);
        }
      }
      Source Code 2/2
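Path copying can be observed directly by comparing node references between two versions. The sketch below is an unbalanced binary search tree in the spirit of the slides. The names `PersistentTree`, `put` and `get` are assumptions for this example, and balancing is deliberately left out.

```java
// Sketch: persistent (unbalanced) binary search tree built with path copying.
final class PersistentTree<K extends Comparable<K>, V> {
    static final class Node<K, V> {
        final K key;
        final V value;
        final Node<K, V> l, r;

        Node(K key, V value, Node<K, V> l, Node<K, V> r) {
            this.key = key;
            this.value = value;
            this.l = l;
            this.r = r;
        }
    }

    final Node<K, V> root; // null for the empty tree

    PersistentTree() { this(null); }

    private PersistentTree(Node<K, V> root) { this.root = root; }

    // Returns a new version; this version keeps its old root.
    PersistentTree<K, V> put(K key, V value) {
        return new PersistentTree<K, V>(insert(root, key, value));
    }

    V get(K key) {
        Node<K, V> n = root;
        while (n != null) {
            int cmp = key.compareTo(n.key);
            if (cmp == 0) return n.value;
            n = cmp < 0 ? n.l : n.r;
        }
        return null;
    }

    // Copies only the nodes on the path from the root to the insertion point.
    private static <K extends Comparable<K>, V> Node<K, V> insert(Node<K, V> n, K key, V value) {
        if (n == null) return new Node<K, V>(key, value, null, null);
        int cmp = key.compareTo(n.key);
        if (cmp < 0) return new Node<K, V>(n.key, n.value, insert(n.l, key, value), n.r);
        if (cmp > 0) return new Node<K, V>(n.key, n.value, n.l, insert(n.r, key, value));
        return new Node<K, V>(key, value, n.l, n.r);
    }
}
```

After `t2 = t1.put(9, "nine")` on a tree rooted at 5 with children 3 and 8, the root and the right child are new objects, while `t2.root.l == t1.root.l` still holds: the untouched left subtree is shared.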
  • 39. Persistent Tree Multiple definitions of persistence: ● Immutable data structure with history ● Committed to a persistent storage Append only databases and file systems: ● CouchDB uses append only B-Tree ● RethinkDB makes append only variant of MySQL ● ZFS, BTRFS implement copy-on-write transactions and snapshots Nothing is new under the moon!
  • 40. Persistent Map
      interface Map<K, V> {
        // get value for a key, or null if not found
        V get(K key);
        // make key/value association
        Map<K, V> put(K key, V value);
        // remove key/value association
        Map<K, V> remove(K key);
      }
      Remember: no in-place updates. Mutations create new instances.
  • 41. Persistent Map Implementation Strategy ● Persistent red-black tree for ordered keys. Time complexity: O(log n) ● Persistent hash table for hashable keys. Time complexity: effectively O(1), since a 32-bit hash code consumed 5 bits per level gives a tree of at most 7 levels
  • 42. Persistent Hash Table But how do we implement it? Copying the whole table would be too expensive!
  • 43. Persistent Hash Table. Here's the idea: partition the hash table into smaller pieces and organize them as a persistent tree. Nice idea, but how do we navigate in such a tree?
  • 44. Prefix Tree/Trie Search is guided by individual letters of a string key Hash code is just a string of digits!
  • 45. Persistent Hash Table in Prefix Tree. Represent 32-bit hash codes as strings of 5-bit symbols: hashCode = CAFEBABE₁₆
      level    6    5      4      3      2      1      0
      bits     11   00101  01111  11101  01110  10101  11110
      symbol   3    5      15     29     14     21     30
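The table above can be reproduced with a shift and a mask. This is a small sketch: the class name `HashSymbols` and the method name `symbol` are made up for illustration, while the expression itself is the one used in the slides' `find` method.

```java
// Sketch: splitting a 32-bit hash code into 5-bit symbols, lowest level first.
final class HashSymbols {
    static final int BITS = 5;
    static final int MASK = (1 << BITS) - 1; // 31, i.e. binary 11111

    // Level 0 is the lowest 5 bits; level 6 holds the 2 remaining top bits.
    static int symbol(int hashCode, int level) {
        // >>> is the unsigned shift, so a negative hash code does not smear sign bits
        return (hashCode >>> (level * BITS)) & MASK;
    }
}
```

For `0xCAFEBABE`, levels 0 through 6 give 30, 21, 14, 29, 15, 5 and 3, matching the table.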
  • 46. Persistent Hash Table hashCode = ... xxxxx xxxxx xxxxx xxxxx Each item is either a key/value pair or a subtree
  • 47. Persistent Hash Table
      class PersistentHashMap<K, V> {
        static abstract class Item<K, V> {}

        static class Node<K, V> extends Item<K, V> {
          // one slot per 5-bit symbol; Java cannot create generic arrays
          // directly, hence the raw array and the unchecked cast
          @SuppressWarnings("unchecked")
          final Item<K, V>[] children = (Item<K, V>[]) new Item[32];
        }

        static class Entry<K, V> extends Item<K, V> {
          final int hashCode;
          final K key;
          final V value;
          final Entry<K, V> next;   // next entry with the same hash code
          ...                       // constructor not shown on the slide
        }
      Source Code 1/2
  • 48. Persistent Hash Table
      class PersistentHashMap<K, V> {
        V get(K key) {
          return root.find(key.hashCode(), key, 0);
        }

        static class Node<K, V> extends Item<K, V> {
          V find(int hashCode, K key, int level) {
            // pick the 5-bit symbol that belongs to this level
            int index = (hashCode >>> (level * 5)) & 31;
            Item<K, V> item = children[index];
            if (item instanceof Node) {
              // subtree: continue one level deeper
              return ((Node<K, V>) item).find(hashCode, key, level + 1);
            }
            if (item instanceof Entry) {
              // key/value pair: compare keys (Entry.find is not shown on the slides)
              return ((Entry<K, V>) item).find(hashCode, key);
            }
            return null;
          }
        }
      Source Code 2/2
  • 49. Persistent Hash Table. Do not waste space!
      class PersistentHashMap {
        class Node<K, V> {
          final Item<K, V>[] children = (Item<K, V>[]) new Item[32];
        }
      ● Most of the children would be null on deeper levels ● The number of arrays grows exponentially as we go deeper ● Need to find a way to compact the tree ● Simply get rid of nulls in arrays!
  • 50. Persistent Hash Table
      class Node<K, V> {
        final int mask;
        final Item<K, V>[] children = (Item<K, V>[]) new Item[Integer.bitCount(mask)];
      }
      ● The mask is a 32-bit integer whose bits are set to 1 only for those array elements that are not null ● The array stores only non-null elements. Its size is the number of 1 bits in the mask. Array size varies from 2 to 32 elements. ● The overhead for a null array element is just one bit. Quite good!
  • 51. Persistent Hash Table ● To test that array has element at index i, simply test if ith bit in the mask is 1: if ((mask & (1 << i)) != 0) { ... ● To get offset to ith element in the array, count number of 1 bits lower than i in the mask: int offset = bitCount(mask & ((1 << i) - 1)); if (children[offset] instanceof ...
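The mask arithmetic above can be tried out in isolation. The following sketch stores plain `Object` children to stay short. `BitmapNode`, `has`, `offset`, `get` and `with` are names assumed for this example, and `Integer.bitCount` is the standard library population count.

```java
// Sketch: a bitmap-compressed node with up to 32 logical slots.
final class BitmapNode {
    final int mask;           // bit i is 1 exactly when logical slot i is occupied
    final Object[] children;  // occupied slots only, in ascending slot order

    BitmapNode() { this(0, new Object[0]); }

    private BitmapNode(int mask, Object[] children) {
        this.mask = mask;
        this.children = children;
    }

    boolean has(int i) {
        return (mask & (1 << i)) != 0;
    }

    // Number of occupied slots below i, which is the position inside the array.
    int offset(int i) {
        return Integer.bitCount(mask & ((1 << i) - 1));
    }

    Object get(int i) {
        return has(i) ? children[offset(i)] : null;
    }

    // Persistent update: copies the small array, never modifies this node.
    BitmapNode with(int i, Object child) {
        int off = offset(i);
        Object[] copy;
        if (has(i)) {
            copy = children.clone();
        } else {
            copy = new Object[children.length + 1];
            System.arraycopy(children, 0, copy, 0, off);
            System.arraycopy(children, off, copy, off + 1, children.length - off);
        }
        copy[off] = child;
        return new BitmapNode(mask | (1 << i), copy);
    }
}
```

A node with slots 2, 7 and 31 occupied holds an array of length 3, and `offset(7)` is 1 because exactly one lower slot (slot 2) is set.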
  • 52. Persistent List
      interface Seq<T> {
        T head();                    // get first element
        Seq<T> tail();               // get list without first element
        Seq<T> cons(T v);            // prepend element (add at head)
        Seq<T> snoc(T v);            // append element (add at tail)
        Seq<T> concat(Seq<T> that);  // join two lists
        int size();                  // get number of elements
        T get(int index);            // get Nth element
        Seq<T> set(int index, T v);  // get a list with the Nth element replaced
      }
      Remember: no in-place updates. Mutations create new instances.
  • 53. Persistent List ● There are quite a few ways to implement persistent lists ● But we will not be studying them ● Instead, we will turn our attention to finger trees ● Soon, it will be clear why
  • 54. Finger Trees ● An incredibly elegant, simple and efficient data structure ● Oh so very versatile, the functional programmer's Swiss Army knife ● Basic data structure for building random access sequences, deques, priority queues, ropes, interval trees, etc. ● Let's define it in stages
  • 55. Persistent leafy 2-3 trees. Let's begin with a simple data structure: the leafy 2-3 tree ● Every intermediate node has either two children or three children ● All values are stored in leafs ● Perfectly balanced: all leafs are at the same level
  • 57. Persistent leafy 2-3 trees Leafs contain interesting values, but what is stored in nodes?
  • 58. Annotated leafy 2-3 trees ● There must be a way to find interesting values in a tree ● We need to guide search from the root of a tree to its leafs ● Let's add special annotations to nodes ● Use these annotations to find values
  • 59. Size annotated leafy 2-3 trees ● Each intermediate node is annotated with the size of a subtree rooted at this node ● Makes it trivial to find any leaf by its index ● Starting from root, test if index is in the range of its left (middle) or right subtree, and repeat recursively for that subtree, until a leaf is found
  • 60. Size annotated leafy 2-3 trees Looks like random access list
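The index search described above can be sketched as follows. `SizedTree`, `SizedLeaf` and `SizedNode` are illustrative names, and the sketch only covers lookup (no insertion or rebalancing). Each inner node caches the number of leafs below it and uses the cached sizes to decide which child to descend into.

```java
// Sketch: index lookup in a size-annotated leafy 2-3 tree.
abstract class SizedTree<T> {
    abstract int size();
    abstract T get(int index);
}

final class SizedLeaf<T> extends SizedTree<T> {
    final T value;

    SizedLeaf(T value) { this.value = value; }

    @Override int size() { return 1; }

    @Override T get(int index) {
        if (index != 0) throw new IndexOutOfBoundsException();
        return value;
    }
}

final class SizedNode<T> extends SizedTree<T> {
    final SizedTree<T> left, middle, right; // middle is null in a two-child node
    final int size;                         // cached annotation: leafs in this subtree

    SizedNode(SizedTree<T> left, SizedTree<T> right) {
        this(left, null, right);
    }

    SizedNode(SizedTree<T> left, SizedTree<T> middle, SizedTree<T> right) {
        this.left = left;
        this.middle = middle;
        this.right = right;
        this.size = left.size() + (middle == null ? 0 : middle.size()) + right.size();
    }

    @Override int size() { return size; }

    @Override T get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
        if (index < left.size()) return left.get(index);
        index -= left.size();              // skip everything in the left subtree
        if (middle != null) {
            if (index < middle.size()) return middle.get(index);
            index -= middle.size();        // skip the middle subtree as well
        }
        return right.get(index);
    }
}
```

The lookup touches one node per level, so in a balanced tree it costs O(log n) steps.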
  • 61. Priority annotated leafy 2-3 trees ● Each intermediate node is annotated with the highest priority of an element in its subtree ● Makes it trivial to find the value with the highest priority ● Starting from the root, find the subtree with the highest priority and descend recursively into it, until a leaf is found
  • 62. Priority annotated leafy 2-3 trees Looks like priority queue
  • 63. Monoids ● One interface to unify size, priority (and more!) annotations on trees ● A set of values with a "zero" element 0 and a binary associative operation ⊕ ● Monoid laws: 0⊕a = a; a⊕0 = a; a⊕(b⊕c) = (a⊕b)⊕c
  • 64. Monoid examples ● Strings with empty string and concatenation: "" + "a" = "a", "a" + "" = "a", "a" + ("b" + "c") = ("a" + "b") + "c" ● Integers with zero and addition: 0 + 1 = 1, 1 + 0 = 1, 1 + (2 + 3) = (1 + 2) + 3 ● Integers with one and multiplication: 1 * 2 = 2, 2 * 1 = 2, 2 * (3 * 4) = (2 * 3) * 4 ● And many more of them! (Monoids are everywhere)
  • 65. Monoid interface
      interface Monoid<T extends Monoid<T>> {
        T unit();
        T combine(T that);
      }

      // illustration only: java.lang.String is final and cannot be redefined like this
      class String implements Monoid<String> {
        ...
        String unit() {
          return "";
        }
        String combine(String that) {
          return this + that;
        }
      }
  • 66. Size monoid
      class Size implements Monoid<Size> {
        final int size;

        Size(int size) {
          this.size = size;
        }

        // zero element: the size of nothing
        Size unit() {
          return new Size(0);
        }

        // sizes of adjacent subtrees add up
        Size combine(Size that) {
          return new Size(this.size + that.size);
        }
      }
  • 67. Priority monoid
      class Priority implements Monoid<Priority> {
        final int priority;

        Priority(int priority) {
          this.priority = priority;
        }

        // zero element: never wins against a real priority under min
        Priority unit() {
          return new Priority(Integer.MAX_VALUE);
        }

        // the smaller number is the more urgent priority
        Priority combine(Priority that) {
          return new Priority(Math.min(this.priority, that.priority));
        }
      }
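The slides model a monoid as an interface implemented by the value class itself. As an alternative sketch that compiles against ordinary `Integer` and `String` values, the monoid can be kept as a separate object, which is a different design from the slides'. The `Monoids` holder, its constants and the `fold` helper are all assumptions introduced for this example.

```java
import java.util.List;

// Sketch: a monoid as a standalone object (unit element plus associative combine).
interface Monoid<T> {
    T unit();
    T combine(T a, T b);

    // Combines any number of values; an empty list yields the unit.
    default T fold(List<T> values) {
        T acc = unit();
        for (T v : values) acc = combine(acc, v);
        return acc;
    }
}

final class Monoids {
    // Sizes: zero and addition.
    static final Monoid<Integer> SIZE = new Monoid<Integer>() {
        public Integer unit() { return 0; }
        public Integer combine(Integer a, Integer b) { return a + b; }
    };

    // Priorities: "no priority" (MAX_VALUE) and minimum, as on the Priority slide.
    static final Monoid<Integer> PRIORITY = new Monoid<Integer>() {
        public Integer unit() { return Integer.MAX_VALUE; }
        public Integer combine(Integer a, Integer b) { return Math.min(a, b); }
    };

    // Strings: empty string and concatenation.
    static final Monoid<String> CONCAT = new Monoid<String>() {
        public String unit() { return ""; }
        public String combine(String a, String b) { return a + b; }
    };
}
```

Because `combine` is associative, the way a tree groups its leafs does not change the folded result, which is what lets a node cache the combined annotation of its children.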
  • 68. But where do we get monoids from? ● Monoids have nice property of composability ● We can get more monoids by combining existing ones ● But where do we get initial monoids to begin with? ● We need a way to measure values! ● Those measures must be monoids, obviously interface Measured<M extends Monoid> { M measure(); }
  • 69. Let's make a sketch of annotated tree (Pseudocode!)
      /**
       * <V> is the type of values
       * <M> is the type of monoidal measures of values
       */
      class Tree<M extends Monoid, V extends Measured<M>> implements Measured<M> {
        abstract class Leaf<M, V> extends Tree<M, V> {
          final V value;
          override abstract M measure();
        }
        class Node<M, V> extends Tree<M, V> {
          final Tree<M, V> left, right;
          final M m;
          Node(Tree<M, V> l, Tree<M, V> r) {
            left = l;
            right = r;
            m = l.measure().combine(r.measure());
          }
          override final M measure() {
            return m;
          }
        }
      }
  • 70. Let's make a sketch of annotated tree (Pseudocode!)
      ...
      class Leaf<V> extends Tree<Size, V> {
        final V value;
        override final Size measure() {
          return new Size(1);
        }
      }
      ...
      class Leaf<V> extends Tree<Priority, V> {
        final V value;
        override final Priority measure() {
          return new Priority(value.priority());
        }
      }
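The pseudocode above can be approximated in compilable Java by passing the measuring function and the combining operation in from outside. This is a sketch under that assumption. `Annotated`, `ALeaf` and `ANode` are hypothetical names, and the tree is binary rather than 2-3 to keep it short.

```java
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Sketch: a leafy binary tree whose nodes cache a monoidal measure of type M.
abstract class Annotated<M, V> {
    abstract M measure();
}

final class ALeaf<M, V> extends Annotated<M, V> {
    final V value;
    final M m;

    // measureOf turns a value into its measure, e.g. "always 1" for sizes.
    ALeaf(V value, Function<V, M> measureOf) {
        this.value = value;
        this.m = measureOf.apply(value);
    }

    @Override M measure() { return m; }
}

final class ANode<M, V> extends Annotated<M, V> {
    final Annotated<M, V> left, right;
    final M m; // computed once at construction, never updated

    // combine is the monoid operation, e.g. addition for sizes or min for priorities.
    ANode(Annotated<M, V> left, Annotated<M, V> right, BinaryOperator<M> combine) {
        this.left = left;
        this.right = right;
        this.m = combine.apply(left.measure(), right.measure());
    }

    @Override M measure() { return m; }
}
```

With `v -> 1` and `Integer::sum` the root measure is the number of leafs; with `v -> v` and `Integer::min` the same tree shape reports the smallest element, so the two annotated trees from the earlier slides differ only in the functions supplied.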
  • 71. But that is not finger tree yet!
  • 72. Finger Tree ... is just an annotated tree of annotated 2-3 trees!
  • 73. Finger Tree Digits, 2-3 trees, fingers and nested levels
  • 74. Finger Tree. A little bit of Haskell would not hurt:
      data Node v a = Node2 v a a | Node3 v a a a
      data Digit v a = One v a | Two v a a | Three v a a a | Four v a a a a
      data FingerTree v a = Empty
                          | Single a
                          | Deep v (Digit v a) (FingerTree v (Node v a)) (Digit v a)
  • 75. Finger Tree class FingerTree<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { class Empty<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> {} class Single<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> { final T v; (a) final M m; (b) class Deep<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> { final Digit<M, T> prefix; (c) final FingerTree<M, Node<M, T>> middle; (d) final Digit<M, T> suffix; (e) final M m; (f) Source Code 1/3
  • 76. Finger Tree class Digit<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { final M m; (a) class One<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a; (b) class Two<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b; (c) class Three<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b, c; (d) class Four<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b, c, d; (e) Source Code 2/3
  • 77. Finger Tree class Node<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { final M m; (a) class Node2<M extends Monoid<M>, T extends Measured<M>> extends Node<M, T> { final T a, b; (b) class Node3<M extends Monoid<M>, T extends Measured<M>> extends Node<M, T> { final T a, b, c; (c) Source Code 3/3
  • 78. Finger Tree Interface. Basic operations: ● cons, snoc: prepend/append an element ● concat: join two trees ● split: find prefix, element and suffix using a predicate. Beyond the scope of this presentation, sorry
  • 79. Finger Tree Performance. Amortized bounds:
      Operation     Finger Tree            2-3 Tree    List
      cons, snoc    O(1)                   O(log n)    O(1)/O(n)
      head, last    O(1)                   O(log n)    O(1)/O(n)
      concat        O(log min(ℓ1, ℓ2))     O(log n)    O(n)
      split         O(log min(n, ℓ-n))     O(log n)    O(n)
      index         O(log min(n, ℓ-n))     O(log n)    O(n)