Persistent Data Structures

    Living in a world where nothing
   changes but everything evolves
                  - or -
A complete idiot's guide to immutability
Java vs Haskell

Java:
 ● Warm, soft and cute
 ● Imperative
 ● Object oriented
 ● Just like good old Basic, but with classes

Haskell:
 ● Strange, unfamiliar alien
 ● Purely functional
 ● Everything is different
 ● Shocking news! It's not like Basic!
Haskell does not have variables!
Imagine a dialect of Java where everything is final by default
  class LinkedList {
   class Node {
     final Node next, prev;
     final Object value;
   }

      final Node head, tail;

      void add(final Object v) {
        for (final Node n = head; n != null; n = n.next) {
        ...
        }
      }
  }


   All fields, parameters and variables are automatically
 immutable, the final is implied everywhere, and there is no
                      way to get rid of it
Haskell does not have variables!

Typical reactions:
 ● "But it doesn't make sense!"
 ● "It won't work!"
 ● "It does for me!"
What is a variable?

var·y/ˈve(ə)rē/
vary, varied, varying

 ● — verb (used with object)
Definition: to change or alter, as in form, appearance,
character, or substance

 ● — verb (used without object)
Definition: to undergo change in appearance, form, substance,
character, etc

 ● — synonyms:
modify, mutate
"Variables" in Haskell

 ● Must be initialized when declared

   YES: int a = 1;          NO: int a;

 ● Cannot be reassigned

   YES: final int a = 1; NO: a = 2;

These are mathematical variables, not imperative ones!
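The loop from the earlier slide cannot advance `n` once every variable is final. The usual workaround is recursion, where each call gets a fresh immutable binding. A minimal sketch, assuming illustrative names (`Node`, `FinalOnly`, `length`) that do not come from the slides:

```java
// Illustrative sketch: every field, parameter and local is final.
final class Node {
    final Object value;
    final Node next;

    Node(final Object value, final Node next) {
        this.value = value;
        this.next = next;
    }
}

final class FinalOnly {
    // Instead of reassigning a loop variable, each recursive call
    // receives a new, immutable binding of n.
    static int length(final Node n) {
        return n == null ? 0 : 1 + length(n.next);
    }

    public static void main(final String[] args) {
        final Node list = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(length(list)); // prints 3
    }
}
```

Nothing is ever reassigned here, which is the sense in which these are mathematical variables.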
When everything is immutable

There is no notion of time:

 ● Functions take old values, produce new values, nothing is
   changed in-place
 ● It does not matter when a function was called, it only
   matters what arguments it was called with

There is no notion of identity:

 ● Everything is a value, complex data structures are values
   too
 ● There is no way to tell if a == b, only if a.equals(b)
 ● In other words, values are never identical to each other, but
   may be equal
I want my linked list!

Basic terminology:

 ● Ephemeral data structure — everything that is not
   persistent. Most Java data structures (lists, sets, etc.) are
   ephemeral.

 ● Persistent data structure — immutable data structure with
   history. No in-place modifications. Operations on it create
   new versions. Older versions are always available. That. Is.
   Simple.

 ● The persistence property has nothing to do with persistent
   storage, like disks! This is a completely different story.
I want my linked list!

 ● In imperative languages, like Java, most data structures are
   ephemeral by default
Designing persistent data structures is somewhat awkward and
not always efficient

 ● In purely functional languages, like Haskell, all data
   structures are automatically persistent!
There is just no other way to make data structures
History of updates




      Making an update to a persistent DS instance
always creates a new instance that contains this update.
         The current version is left unmodified.
Why should I bother?

      Is it fun? Hell yeah!




 But is it practical? Let's see!
The free lunch is over!
"The biggest sea change in software development
 since the OO revolution is knocking at the door,
  and its name is Concurrency." — Herb Sutter


                                  Commodity hardware (my laptop)




The need for writing correct multi-threaded code
           is constantly increasing
Concurrent data structures are hard!

Want a concurrent ephemeral linked list?
Here are some implementation strategies:

 ● Coarse-grained synchronization
 ● Fine-grained synchronization
 ● Optimistic synchronization
 ● Lazy synchronization
All lock-based — no composition, deadlocks, etc

 ● Non-blocking synchronization in different flavors
And if you need the size of a list, you are in trouble!
Concurrent data structures are hard!

● Making mutable concurrent data structures requires
  inter-thread coordination within these structures

● Locks and atomic references all over the place

● Decades of research by academia with many attempts

● Sophisticated algorithms that are hard to reason about, test
  and prove

● Several different ways to solve the same problems, each
  with its own pros and cons
Concurrent data structures are hard!

Yes, but are persistent data structures actually simpler?
Just give up mutability!

● Persistent data structures are easy to reason about in
  a concurrent environment

● The behavior does not depend on how many threads are
  trying to "modify" it at once

● Therefore persistent data structures are very easy to test
  and debug
The whole picture

 ● Persistent data structures alone are not sufficient
They are an essential part of the picture, but not the whole
answer to concurrency
 ● Inter-thread coordination is needed
Threads still need to know what each other thread is doing to
agree on a common outcome

 ● But it can be added "outside"
Which gives us complete separation of concerns
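One way to picture coordination added "outside" is a single atomic reference that points at the current version of an immutable structure. This is a sketch under assumptions: `ImmutableStack` and `SharedStackDemo` are hypothetical names, and a compare-and-set retry loop (via `AtomicReference.updateAndGet`) stands in for whatever coordination mechanism a real system would use.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal immutable stack; it knows nothing about threads.
final class ImmutableStack<T> {
    final T head;
    final ImmutableStack<T> tail;
    final int size;

    ImmutableStack() { this(null, null, 0); }

    private ImmutableStack(T head, ImmutableStack<T> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    ImmutableStack<T> push(T v) {
        return new ImmutableStack<T>(v, this, size + 1);
    }
}

final class SharedStackDemo {
    // Coordination lives outside the data structure: one atomic
    // reference to the current version.
    static int pushConcurrently(int threads, int pushesPerThread) {
        AtomicReference<ImmutableStack<Integer>> current =
            new AtomicReference<>(new ImmutableStack<Integer>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < pushesPerThread; i++) {
                    final int v = i;
                    // Retries automatically if another thread won the race.
                    current.updateAndGet(s -> s.push(v));
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return current.get().size;
    }

    public static void main(String[] args) {
        System.out.println(pushConcurrently(4, 1000)); // prints 4000
    }
}
```

The stack itself contains no locks or atomics; only the one reference that names "the current version" is shared mutable state.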
The whole picture

Solving the concurrency challenge in a modern language:

 ● Scala Way — Persistent data structures with message
   passing

 ● Clojure Way — Persistent data structures with software
   transactional memory

 ● Will likely be mixed in the future
Last few words on concurrency

● Persistent data structures are slower than ephemeral ones
  in sequential use

● But not that much slower!

● We can forgive it, since they give you more functionality,
  and ephemeral data structures are simply less capable

● And in the multiprocessor era, it is better to make things
  scalable rather than fast
Efficient persistent data structures

We want persistent data structures to be space and time
efficient:

 ● Structural sharing
We want to reuse as many fragments of the previous version
as possible
 ● Path copying
We want to copy as few pieces as possible
 ● Maybe, just maybe lazy evaluation (where available)
We don't want nasty pathological cases
A case study

● Let's make some persistent data
  structures in Java

● All these structures consist of
  classes with only final fields

● With good amortized asymptotic
  complexity in most cases

"Why are you looking at me?!"
Our plan

Let's start with some trivial examples

 ● Stack

 ● Queue

 ● Tree

Then proceed with more advanced structures

 ● Hash Table

 ● Finger Tree
Trivial Example — Persistent Stack
// It's just a singly linked list of nodes
class Stack<T> {
 final T v;           // value stored in this node
 final Stack<T> next; // rest of the stack; null marks the empty stack

 Stack() {
   v = null;
   next = null;
 }

 Stack(T v, Stack<T> next) {
   this.v = v;
   this.next = next;
 }
 ...




                               Source Code 1/2
Trivial Example — Persistent Stack
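class Stack<T> {
 ...
 boolean isEmpty() {
   return next == null; // only the empty stack has no next node
 }

 Stack<T> push(T v) {
   return new Stack<T>(v, this); // the old stack becomes the tail, unchanged
 }

 T peek() {
   if (isEmpty())
     throw new NoSuchElementException();
   return v;
 }

 Stack<T> pop() {
   if (isEmpty())
     throw new NoSuchElementException();
   return next; // the previous version, shared rather than copied
 }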




                                Source Code 2/2
Trivial Example — Persistent Stack




      Structural sharing in persistent stack
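A small runnable sketch of the sharing described here. It restates the stack from the slides in self-contained form (the `isEmpty` helper and the `StackSharingDemo` class are assumptions added for the example) and checks that two versions pushed onto the same base share that base by reference.

```java
import java.util.NoSuchElementException;

// Self-contained variant of the stack from the slides; the empty stack
// is the node whose next is null.
final class Stack<T> {
    final T v;
    final Stack<T> next;

    Stack() { this(null, null); }
    Stack(T v, Stack<T> next) { this.v = v; this.next = next; }

    boolean isEmpty() { return next == null; }

    Stack<T> push(T v) { return new Stack<T>(v, this); }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return v;
    }

    Stack<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return next;
    }
}

final class StackSharingDemo {
    public static void main(String[] args) {
        Stack<String> base = new Stack<String>().push("a").push("b");
        Stack<String> left = base.push("c");   // one new version
        Stack<String> right = base.push("d");  // another, same parent

        System.out.println(left.peek());         // prints c
        System.out.println(right.peek());        // prints d
        System.out.println(base.peek());         // prints b
        System.out.println(left.pop() == base);  // prints true
        System.out.println(right.pop() == base); // prints true
    }
}
```

Both pushes allocate exactly one node each; everything below that node is the same object in all three versions.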
Trivial Example — Persistent Stack


      Looks familiar?
     The versions tree!
Trivial Example — Persistent Stack



    Also known as
   Spaghetti stack or
     Cactus stack
Persistent Queue




It's just two stacks combined:

 ● Back stack to enqueue items
 ● Front stack to dequeue items

When the front stack is empty, reverse the back stack and
use it as the front stack
Persistent Queue
class Queue<T> {
 // back stack - push elements here
 final Stack<T> b;
 // front stack - pop elements from here
 final Stack<T> f;

 Queue() {
   b = f = new Stack<T>();
 }

 Queue(Stack<T> b, Stack<T> f) {
   this.b = b;
   this.f = f;
 }

 boolean isEmpty() {
   // check() keeps f non-empty whenever the queue holds any element
   return f.isEmpty();
 }
 ...


                              Source Code 1/3
Persistent Queue
class Queue<T> {
 ...
 static <T> Queue<T> check(Stack<T> b, Stack<T> f) {
   if (f.isEmpty())
     // front ran out: the reversed back stack becomes the new front
     return new Queue<T>(f, b.reverse());
   else
     return new Queue<T>(b, f);
 }

 Queue<T> push(T v) {
   return check(b.push(v), f);
 }

 Queue<T> pop() {
   if (isEmpty()) {
     throw new NoSuchElementException();
   }
   return check(b, f.pop());
 }


                                 Source Code 2/3
Persistent Queue
class Queue<T> {
 ...
 T peek() {
   if (isEmpty()) {
     throw new NoSuchElementException();
   }
   return f.peek();
 }

class Stack<T> {
 ...
 Stack<T> reverse() {
   if (isEmpty() || next.isEmpty())
     return this;
   Stack<T> r = new Stack<T>();
   for (Stack<T> s = this; !s.isEmpty(); s = s.pop()) {
     r = r.push(s.peek());
   }
   return r;
 }

                               Source Code 3/3
Persistent Queue




Structural sharing in persistent queue
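The two-stack queue can be exercised end to end with a compact, self-contained sketch. The names `Cell` and `PersistentQueue` are assumptions made for this example, and `null` stands for the empty stack to keep it short; the `check` logic follows the slides.

```java
import java.util.NoSuchElementException;

// Compact immutable singly linked list used as a stack (null = empty).
final class Cell<T> {
    final T head;
    final Cell<T> tail;

    Cell(T head, Cell<T> tail) { this.head = head; this.tail = tail; }

    static <T> Cell<T> reverse(Cell<T> s) {
        Cell<T> r = null;
        for (Cell<T> c = s; c != null; c = c.tail) {
            r = new Cell<T>(c.head, r);
        }
        return r;
    }
}

// Two-stack queue; invariant: front is empty only if the queue is empty.
final class PersistentQueue<T> {
    final Cell<T> back;
    final Cell<T> front;

    PersistentQueue() { this(null, null); }

    private PersistentQueue(Cell<T> back, Cell<T> front) {
        this.back = back;
        this.front = front;
    }

    private static <T> PersistentQueue<T> check(Cell<T> back, Cell<T> front) {
        return front == null
            ? new PersistentQueue<T>(null, Cell.reverse(back))
            : new PersistentQueue<T>(back, front);
    }

    boolean isEmpty() { return front == null; }

    PersistentQueue<T> push(T v) {
        return check(new Cell<T>(v, back), front);
    }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return front.head;
    }

    PersistentQueue<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return check(back, front.tail);
    }
}

final class QueueDemo {
    public static void main(String[] args) {
        PersistentQueue<Integer> q0 =
            new PersistentQueue<Integer>().push(1).push(2).push(3);
        PersistentQueue<Integer> q1 = q0.pop();
        System.out.println(q0.peek()); // prints 1 (old version intact)
        System.out.println(q1.peek()); // prints 2
    }
}
```

Popping from `q0` returns a new queue and leaves `q0` usable, which is exactly what makes repeated pops from one old version expensive when each of them triggers a reversal.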
Persistent Queue

Beware pathological cases!

 ● What if the front stack is empty, but the back stack is full?

 ● And we are going to pop from the same queue N times

 ● Then we get N back stack reversals!

 ● Lazy evaluation to the rescue — use lazy streams instead of
   strict stacks
Persistent Queue


                               But there is a better way
                                  to design a queue!




Monoidally Annotated 2-3 Finger Tree is a versatile data
structure that can be used to build efficient lists, deques,
priority queues, interval trees, ropes, etc.

It is more complex, so we will take a look at it later.
Persistent Tree

● It is trivial to convert any ephemeral tree to a persistent one
  by means of path copying

● It works for binary trees, 2-3 trees, B-trees, etc

● The shape of the tree is not affected, only the mutating algorithms

● In a balanced binary tree at most log N nodes need to be
  copied — quite efficient

● The secret to all persistent data structures is that they all
  are trees! (Yes, lists and hash tables are trees too)
Persistent Tree
Simple Persistent Binary Tree

class SimpleBinaryTree {
 static class Node<K, V> {
   final K key;
   final V value;
   final Node<K, V> l, r; // left and right subtrees, null when absent

   Node(K key, V value, Node<K, V> l, Node<K, V> r) {
     this.key = key;
     this.value = value;
     this.l = l;
     this.r = r;
   }
 }
 ...




                           Source Code 1/2
Simple Persistent Binary Tree

class SimpleBinaryTree {
 ...
 static <K extends Comparable<K>, V> Node<K, V> insert(
     Node<K, V> n, K key, V value) {
   if (n == null) {
     return new Node<K, V>(key, value, null, null); // new leaf
   }
   int cmp = key.compareTo(n.key);
   if (cmp < 0) {
     // copy this node, rebuild the left path, share the right subtree
     return new Node<K, V>(n.key, n.value,
      insert(n.l, key, value), n.r);
   }
   if (cmp > 0) {
     // copy this node, share the left subtree, rebuild the right path
     return new Node<K, V>(n.key, n.value,
      n.l, insert(n.r, key, value));
   }
   return new Node<K, V>(key, value, n.l, n.r); // replace the value
 }



                            Source Code 2/2
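Path copying can be observed directly by comparing references between two versions. The sketch below is a self-contained restatement of the tree above; `PNode`, `get` and `PathCopyDemo` are names assumed for the example.

```java
// Unbalanced persistent binary search tree with path copying.
final class PNode<K extends Comparable<K>, V> {
    final K key;
    final V value;
    final PNode<K, V> l, r;

    PNode(K key, V value, PNode<K, V> l, PNode<K, V> r) {
        this.key = key;
        this.value = value;
        this.l = l;
        this.r = r;
    }

    static <K extends Comparable<K>, V> PNode<K, V> insert(
            PNode<K, V> n, K key, V value) {
        if (n == null) return new PNode<K, V>(key, value, null, null);
        int cmp = key.compareTo(n.key);
        if (cmp < 0) {
            return new PNode<K, V>(n.key, n.value, insert(n.l, key, value), n.r);
        }
        if (cmp > 0) {
            return new PNode<K, V>(n.key, n.value, n.l, insert(n.r, key, value));
        }
        return new PNode<K, V>(key, value, n.l, n.r);
    }

    static <K extends Comparable<K>, V> V get(PNode<K, V> n, K key) {
        while (n != null) {
            int cmp = key.compareTo(n.key);
            if (cmp == 0) return n.value;
            n = cmp < 0 ? n.l : n.r;
        }
        return null;
    }
}

final class PathCopyDemo {
    public static void main(String[] args) {
        PNode<Integer, String> empty = null;
        PNode<Integer, String> v1 = PNode.insert(
            PNode.insert(PNode.insert(empty, 5, "five"), 3, "three"),
            8, "eight");
        PNode<Integer, String> v2 = PNode.insert(v1, 9, "nine");

        System.out.println(v2.l == v1.l);     // prints true  (shared)
        System.out.println(v2.r == v1.r);     // prints false (copied path)
        System.out.println(PNode.get(v1, 9)); // prints null
        System.out.println(PNode.get(v2, 9)); // prints nine
    }
}
```

Only the nodes on the path from the root to the insertion point are new; the left subtree is the same object in both versions.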
Persistent Tree

Multiple definitions of persistence:

 ● Immutable data structure with history
 ● Committed to a persistent storage

Append only databases and file systems:

 ● CouchDB uses append only B-Tree
 ● RethinkDB makes append only variant of MySQL
 ● ZFS, BTRFS implement copy-on-write transactions
   and snapshots

Nothing is new under the moon!
Persistent Map

interface Map<K, V> {
  // get value for a key, or null if not found
  V get(K key);
  // make key/value association
  Map<K, V> put(K key, V value);
  // remove key/value association
  Map<K, V> remove(K key);
}




             Remember, no in-place updates
             Mutations create new instances
Persistent Map

Implementation Strategy

 ● Persistent red-black tree for ordered keys
   Time complexity — O(log n)

 ● Persistent hash table for hashable keys
   Time complexity — O(log₃₂ n), effectively O(1)
Persistent Hash Table

But how do we implement it?
Copying the whole table would be too expensive!
Persistent Hash Table

Here's the idea: partition the hash table into smaller
pieces and organize them as a persistent tree




Nice idea, but how do we navigate in such a tree?
Prefix Tree/Trie
Search is guided by individual letters of a string key




Hash code is just a string of digits!
Persistent Hash Table in Prefix Tree

Represent 32-bit hash codes as strings of 5-bit symbols:

hashCode = 0xCAFEBABE

level    6   5      4      3      2      1      0
bits     11  00101  01111  11101  01110  10101  11110
symbol   3   5      15     29     14     21     30
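The symbol row of this table can be reproduced with a shift and a mask. A minimal sketch, assuming a hypothetical helper class `HashSymbols`; level 0 is the lowest five bits, so the output reads right to left relative to the table.

```java
import java.util.Arrays;

final class HashSymbols {
    static final int BITS = 5;
    static final int MASK = (1 << BITS) - 1; // 31

    // The 5-bit symbol of a hash code at the given trie level.
    static int symbol(int hashCode, int level) {
        return (hashCode >>> (level * BITS)) & MASK;
    }

    // All seven symbols of a 32-bit hash: six full ones plus
    // the two leftover high bits at level 6.
    static int[] symbols(int hashCode) {
        int[] out = new int[7];
        for (int level = 0; level < out.length; level++) {
            out[level] = symbol(hashCode, level);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(symbols(0xCAFEBABE)));
        // prints [30, 21, 14, 29, 15, 5, 3]
    }
}
```

The unsigned shift `>>>` matters here because `0xCAFEBABE` is negative as a Java `int`.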
Persistent Hash Table

     hashCode = ... xxxxx xxxxx xxxxx xxxxx




Each item is either a key/value pair or a subtree
Persistent Hash Table

class PersistentHashMap {
 abstract class Item<K, V> {}

 class Node<K, V> extends Item<K, V> {
   // Java cannot create generic arrays directly, hence the cast
   @SuppressWarnings("unchecked")
   final Item<K, V>[] children = (Item<K, V>[]) new Item[32];
 }

 class Entry<K, V> extends Item<K, V> {
   final int hashCode;
   final K key;
   final V value;
   final Entry<K, V> next;
 }




                        Source Code 1/2
Persistent Hash Table

class PersistentHashMap {
 V get(K key) {
   return root.find(key.hashCode(), key, 0);
 }

 class Node<K, V> extends Item<K, V> {
  V find(int hashCode, K key, int level) {
    // take the 5-bit symbol for this level as the child index
    int index = (hashCode >>> (level * 5)) & 31;
    Item<K, V> item = children[index];
    if (item instanceof Node) {
      return ((Node<K, V>) item)
       .find(hashCode, key, level + 1);
    }
    if (item instanceof Entry) {
      return ((Entry<K, V>) item)
       .find(hashCode, key);
    }
    return null;
  }

                          Source Code 2/2
Persistent Hash Table

Do not waste space!

      class PersistentHashMap {
       class Node<K, V> {
         final Item<K, V>[] children = (Item<K, V>[]) new Item[32];
       }


 ● Most of the children would be null on deeper levels

 ● The number of arrays grows exponentially as we go deeper

 ● Need to find a way to compact tree

 ● Simply get rid of nulls in arrays!
Persistent Hash Table

    class Node<K, V> {
      final int mask;
      final Item<K, V>[] children =
        (Item<K, V>[]) new Item[bitCount(mask)];
    }


● Mask is a 32-bit integer whose bits are set to 1 only for those
  array elements that are not null

● Array stores only non-null elements. Its size is the number
  of 1 bits in the mask. Array size varies from 2 to 32
  elements.

● Overhead for null array element is just one bit. Quite good!
Persistent Hash Table

● To test that array has element at index i, simply test if ith bit
  in the mask is 1:

  if ((mask & (1 << i)) != 0) { ...

● To get offset to ith element in the array, count number of 1
  bits lower than i in the mask:

  int offset = bitCount(mask & ((1 << i) - 1));
  if (children[offset] instanceof ...
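The mask test and the offset computation fit together as follows. This is a sketch under assumptions: `BitmapNode` and its `with` method are hypothetical, children are plain `Object`s for brevity, and `Integer.bitCount` from the standard library plays the role of `bitCount`.

```java
final class BitmapNode {
    final int mask;          // bit i is 1 if logical slot i is occupied
    final Object[] children; // only occupied slots, in slot order

    BitmapNode(int mask, Object[] children) {
        this.mask = mask;
        this.children = children;
    }

    boolean has(int i) {
        return (mask & (1 << i)) != 0;
    }

    // Number of occupied slots below i = position in the compact array.
    int offset(int i) {
        return Integer.bitCount(mask & ((1 << i) - 1));
    }

    Object get(int i) {
        return has(i) ? children[offset(i)] : null;
    }

    // Persistent update: copy the compact array, never modify in place.
    BitmapNode with(int i, Object child) {
        int off = offset(i);
        if (has(i)) {
            Object[] copy = children.clone();
            copy[off] = child;
            return new BitmapNode(mask, copy);
        }
        Object[] copy = new Object[children.length + 1];
        System.arraycopy(children, 0, copy, 0, off);
        copy[off] = child;
        System.arraycopy(children, off, copy, off + 1, children.length - off);
        return new BitmapNode(mask | (1 << i), copy);
    }

    public static void main(String[] args) {
        BitmapNode empty = new BitmapNode(0, new Object[0]);
        BitmapNode n = empty.with(30, "z").with(1, "a").with(5, "m");
        System.out.println(n.children.length); // prints 3
        System.out.println(n.offset(30));      // prints 2
        System.out.println(n.get(5));          // prints m
        System.out.println(n.get(4));          // prints null
    }
}
```

Three occupied slots cost three array cells plus one `int`, instead of a 32-element array.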
Persistent List

interface Seq<T> {
  T head(); // get first element
  Seq<T> tail(); // get list without first element
  Seq<T> cons(T v); // prepend element at the head
  Seq<T> snoc(T v); // append element at the tail
  Seq<T> concat(Seq<T> that); // join two lists
  int size(); // get number of elements
  T get(int index); // get Nth element
  Seq<T> set(int index, T v); // set Nth element
}




             Remember, no in-place updates
             Mutations create new instances
Persistent List

● There are quite a few ways to implement persistent lists

● But we will not be studying them

● Instead, we will turn our attention to finger trees

● Soon, it will be clear why
Finger Trees

● An incredibly elegant, simple and efficient data structure

● Oh so very versatile, functional programmer's Swiss Army
  knife

● Basic data structure for building random-access sequences,
  deques, priority queues, ropes, interval trees, etc.

● Let's define it in stages
Persistent leafy 2-3 trees

Let's begin with a simple data structure — leafy 2-3 tree

 ● Every intermediate node has either two children or three
   children

 ● All values are stored in leaves

 ● Perfectly balanced — all leaves are at the same level
Persistent leafy 2-3 trees



         Leaves contain interesting values,
           but what is stored in nodes?
Annotated leafy 2-3 trees

● There must be a way to find interesting values in a tree

● We need to guide search from the root of a tree to its leaves

● Let's add special annotations to nodes

● Use these annotations to find values
Size annotated leafy 2-3 trees

● Each intermediate node is annotated with the size of a
  subtree rooted at this node

● Makes it trivial to find any leaf by its index

● Starting from root, test if index is in the range of its left
  (middle) or right subtree, and repeat recursively for that
  subtree, until a leaf is found
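The index search described above might look like this in code. It is a minimal sketch, assuming hypothetical classes `SizedTree`, `SizedLeaf` and `SizedNode`; a 2-node is represented by leaving the middle child `null`.

```java
abstract class SizedTree<T> {
    abstract int size();
    abstract T get(int index);
}

final class SizedLeaf<T> extends SizedTree<T> {
    final T value;

    SizedLeaf(T value) { this.value = value; }

    int size() { return 1; }

    T get(int index) {
        if (index != 0) throw new IndexOutOfBoundsException();
        return value;
    }
}

final class SizedNode<T> extends SizedTree<T> {
    final SizedTree<T> left, middle, right; // middle is null in a 2-node
    final int size;                         // cached size annotation

    SizedNode(SizedTree<T> left, SizedTree<T> right) {
        this(left, null, right);
    }

    SizedNode(SizedTree<T> left, SizedTree<T> middle, SizedTree<T> right) {
        this.left = left;
        this.middle = middle;
        this.right = right;
        this.size = left.size()
            + (middle == null ? 0 : middle.size())
            + right.size();
    }

    int size() { return size; }

    T get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
        if (index < left.size()) return left.get(index);
        int i = index - left.size();       // skip the left subtree
        if (middle != null) {
            if (i < middle.size()) return middle.get(i);
            i -= middle.size();            // skip the middle subtree
        }
        return right.get(i);
    }
}

final class SizedTreeDemo {
    static SizedTree<String> sample() {
        SizedTree<String> ab = new SizedNode<String>(
            new SizedLeaf<String>("a"), new SizedLeaf<String>("b"));
        SizedTree<String> cde = new SizedNode<String>(
            new SizedLeaf<String>("c"), new SizedLeaf<String>("d"),
            new SizedLeaf<String>("e"));
        return new SizedNode<String>(ab, cde);
    }

    public static void main(String[] args) {
        System.out.println(sample().size()); // prints 5
        System.out.println(sample().get(3)); // prints d
    }
}
```

Each step subtracts the sizes of the subtrees it skips, so the lookup touches one node per level.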
Size annotated leafy 2-3 trees




     Looks like random access list
Priority annotated leafy 2-3 trees

● Each intermediate node is annotated with the highest
  priority of an element in its subtree

● Makes it trivial to find value with the highest priority

● Starting from root, find the subtree with the highest priority
  and descend recursively into it, until a leaf is found
Priority annotated leafy 2-3 trees




         Looks like priority queue
Monoids

● One interface to unify size, priority (and more!) annotations
  on trees

● A set of values with a "zero" element 0 and a binary
  associative operation ⊕

● Monoid laws:
  0⊕a = a
  a⊕0 = a
  a⊕(b⊕c) = (a⊕b)⊕c
Monoid examples

● Strings with empty string and concatenation
  "" + "a" = "a", "a" + "" = "a"
  "a" + ("b" + "c") = ("a" + "b") + "c"

● Integers with zero and addition
  0 + 1 = 1, 1 + 0 = 1
  1 + (2 + 3) = (1 + 2) + 3

● Integers with one and multiplication
  1 * 2 = 2, 2 * 1 = 2
  2 * (3 * 4) = (2 * 3) * 4

● And many more of them! (Monoids are everywhere)
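The three laws can be spot-checked mechanically for the examples above. In this sketch the `MonoidLaws.holds` helper is an assumption made for illustration; it tests a single sample triple, so a `true` result is evidence rather than proof.

```java
import java.util.function.BinaryOperator;

final class MonoidLaws {
    // Checks identity and associativity for one sample triple (a, b, c).
    static <T> boolean holds(T zero, BinaryOperator<T> op, T a, T b, T c) {
        boolean leftIdentity = op.apply(zero, a).equals(a);
        boolean rightIdentity = op.apply(a, zero).equals(a);
        boolean associative = op.apply(a, op.apply(b, c))
            .equals(op.apply(op.apply(a, b), c));
        return leftIdentity && rightIdentity && associative;
    }

    public static void main(String[] args) {
        // Strings with empty string and concatenation
        System.out.println(holds("", String::concat, "a", "b", "c")); // prints true
        // Integers with zero and addition
        System.out.println(holds(0, Integer::sum, 1, 2, 3));          // prints true
        // Integers with one and multiplication
        System.out.println(holds(1, (x, y) -> x * y, 2, 3, 4));       // prints true
        // Counterexample: subtraction is not a monoid
        System.out.println(holds(0, (x, y) -> x - y, 1, 2, 3));       // prints false
    }
}
```

Subtraction fails already at the left identity, since 0 - 1 is not 1.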
Monoid interface

interface Monoid<T extends Monoid<T>> {
  T unit();
  T combine(T that);
}

class String implements Monoid<String> {
 ...

    String unit() {
      return ""; (a)
    }

    String combine(String that) {
      return this + that; (b)
    }
}
Size monoid

class Size implements Monoid<Size> {
 final int size; (a)

    Size(int size) {
      this.size = size;
    }

    Size unit() {
      return new Size(0); (b)
    }

    Size combine(Size that) {
      return new Size(this.size + that.size); (c)
    }
}
Priority monoid

class Priority implements Monoid<Priority> {
 final int priority;

    Priority(int priority) {
      this.priority = priority;
    }

    Priority unit() {
      return new Priority(Integer.MAX_VALUE); // neutral element for min
    }

    Priority combine(Priority that) {
      return new Priority(
       Math.min(this.priority, that.priority));
    }
}
But where do we get monoids from?

● Monoids have nice property of composability

● We can get more monoids by combining existing ones

● But where do we get initial monoids to begin with?

● We need a way to measure values!

● Those measures must be monoids, obviously
    interface Measured<M extends Monoid<M>> {
      M measure();
    }
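Composability can be illustrated with a product monoid that combines two measures component-wise. The `Both` class below is a hypothetical name; `Size` and `Priority` restate the monoids from the earlier slides so that the sketch compiles on its own.

```java
interface Monoid<T extends Monoid<T>> {
    T unit();
    T combine(T that);
}

final class Size implements Monoid<Size> {
    final int size;
    Size(int size) { this.size = size; }
    public Size unit() { return new Size(0); }
    public Size combine(Size that) { return new Size(size + that.size); }
}

final class Priority implements Monoid<Priority> {
    final int priority;
    Priority(int priority) { this.priority = priority; }
    public Priority unit() { return new Priority(Integer.MAX_VALUE); }
    public Priority combine(Priority that) {
        return new Priority(Math.min(priority, that.priority));
    }
}

// Product of two monoids: units and combination are taken per component.
final class Both<A extends Monoid<A>, B extends Monoid<B>>
        implements Monoid<Both<A, B>> {
    final A first;
    final B second;

    Both(A first, B second) {
        this.first = first;
        this.second = second;
    }

    public Both<A, B> unit() {
        return new Both<A, B>(first.unit(), second.unit());
    }

    public Both<A, B> combine(Both<A, B> that) {
        return new Both<A, B>(
            first.combine(that.first), second.combine(that.second));
    }
}

final class ProductMonoidDemo {
    public static void main(String[] args) {
        Both<Size, Priority> x =
            new Both<Size, Priority>(new Size(1), new Priority(7));
        Both<Size, Priority> y =
            new Both<Size, Priority>(new Size(1), new Priority(3));
        Both<Size, Priority> z = x.combine(y);
        System.out.println(z.first.size);      // prints 2
        System.out.println(z.second.priority); // prints 3
    }
}
```

A tree annotated with `Both<Size, Priority>` would support indexing and minimum lookup at once, without changing the tree code.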
Let's make a sketch of annotated tree
/** <V> is the type of values
   <M> is the type of monoidal measures of values */
class Tree<M extends Monoid, V extends Measured<M>>
   implements Measured<M> { (a)

 abstract class Leaf<M, V> extends Tree<M, V> {
   final V value; (b)
   override abstract M measure(); (c)
 }

 class Node<M, V> extends Tree<M, V> {
  final Tree<M, V> left, right; (d)
  final M m; (e)
  Node(Tree<M, V> l, Tree<M, V> r) {
    left = l; right = r;
    m = l.measure().combine(r.measure()); (f)
  }
  override final M measure() {
    return m; (g)
  }
                                                   Pseudocode!
Let's make a sketch of annotated tree
...
class Leaf<V> extends Tree<Size, V> {
  final V value;

    override final Size measure() {
      return new Size(1); (a)
    }
}

...
class Leaf<V> extends Tree<Priority, V> {
  final V value;

    override final Priority measure() {
      return new Priority(value.priority()); (b)
    }
}
                                                   Pseudocode!
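The pseudocode can be turned into compilable Java along the following lines. This is one possible sketch, not the author's implementation: it uses a plain binary tree, lets each leaf delegate to its value's `measure()`, and introduces a hypothetical `Task` value type and a `best` search helper.

```java
interface Monoid<T extends Monoid<T>> {
    T unit();
    T combine(T that);
}

interface Measured<M extends Monoid<M>> {
    M measure();
}

final class Priority implements Monoid<Priority> {
    final int priority;
    Priority(int priority) { this.priority = priority; }
    public Priority unit() { return new Priority(Integer.MAX_VALUE); }
    public Priority combine(Priority that) {
        return new Priority(Math.min(priority, that.priority));
    }
}

// Hypothetical value type that can measure itself.
final class Task implements Measured<Priority> {
    final String name;
    final int priority;
    Task(String name, int priority) { this.name = name; this.priority = priority; }
    public Priority measure() { return new Priority(priority); }
}

abstract class Tree<M extends Monoid<M>, V extends Measured<M>>
        implements Measured<M> {
}

final class Leaf<M extends Monoid<M>, V extends Measured<M>> extends Tree<M, V> {
    final V value;
    Leaf(V value) { this.value = value; }
    public M measure() { return value.measure(); }
}

final class Node<M extends Monoid<M>, V extends Measured<M>> extends Tree<M, V> {
    final Tree<M, V> left, right;
    final M m; // cached annotation, computed once at construction
    Node(Tree<M, V> l, Tree<M, V> r) {
        left = l;
        right = r;
        m = l.measure().combine(r.measure());
    }
    public M measure() { return m; }
}

final class AnnotatedTreeDemo {
    // Follow the annotations down to the leaf with the smallest number.
    static Task best(Tree<Priority, Task> t) {
        while (t instanceof Node) {
            Node<Priority, Task> n = (Node<Priority, Task>) t;
            t = n.left.measure().priority <= n.right.measure().priority
                ? n.left : n.right;
        }
        return ((Leaf<Priority, Task>) t).value;
    }

    static Tree<Priority, Task> sample() {
        Tree<Priority, Task> a = new Leaf<Priority, Task>(new Task("write", 5));
        Tree<Priority, Task> b = new Leaf<Priority, Task>(new Task("test", 2));
        Tree<Priority, Task> c = new Leaf<Priority, Task>(new Task("ship", 9));
        return new Node<Priority, Task>(new Node<Priority, Task>(a, b), c);
    }

    public static void main(String[] args) {
        System.out.println(sample().measure().priority); // prints 2
        System.out.println(best(sample()).name);         // prints test
    }
}
```

Swapping `Priority` for a size monoid would turn the same tree into an indexable sequence; only the measure and the search predicate change.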
But that is not finger tree yet!
Finger Tree




... is just an annotated tree of annotated 2-3 trees!
Finger Tree




Digits, 2-3 trees, fingers and nested levels
Finger Tree

A little bit of Haskell would not hurt:

data Node v a = Node2 v a a | Node3 v a a a

data Digit v a = One v a
        | Two v a a
        | Three v a a a
        | Four v a a a a

data FingerTree v a = Empty
           | Single a
           | Deep v
             (Digit v a)
             (FingerTree v (Node v a))
             (Digit v a)
Finger Tree

class FingerTree<M extends Monoid<M>, T extends Measured<M>>
   implements Measured<M> {

 class Empty<M extends Monoid<M>, T extends Measured<M>>
    extends FingerTree<M, T> {}

 class Single<M extends Monoid<M>, T extends Measured<M>>
    extends FingerTree<M, T> {
  final T v; (a)
  final M m; (b)

 class Deep<M extends Monoid<M>, T extends Measured<M>>
    extends FingerTree<M, T> {
  final Digit<M, T> prefix; (c)
  final FingerTree<M, Node<M, T>> middle; (d)
  final Digit<M, T> suffix; (e)
  final M m; (f)



                                  Source Code 1/3
Finger Tree

class Digit<M extends Monoid<M>, T extends Measured<M>>
   implements Measured<M> {
 final M m; (a)

 class One<M extends Monoid<M>, T extends Measured<M>>
    extends Digit<M, T> {
  final T a; (b)

 class Two<M extends Monoid<M>, T extends Measured<M>>
    extends Digit<M, T> {
  final T a, b; (c)

 class Three<M extends Monoid<M>, T extends Measured<M>>
    extends Digit<M, T> {
  final T a, b, c; (d)

 class Four<M extends Monoid<M>, T extends Measured<M>>
    extends Digit<M, T> {
  final T a, b, c, d; (e)

                                  Source Code 2/3
Finger Tree

class Node<M extends Monoid<M>, T extends Measured<M>>
   implements Measured<M> {
 final M m; (a)

 class Node2<M extends Monoid<M>, T extends Measured<M>>
    extends Node<M, T> {
  final T a, b; (b)

 class Node3<M extends Monoid<M>, T extends Measured<M>>
    extends Node<M, T> {
  final T a, b, c; (c)




                                 Source Code 3/3
Finger Tree Interface

Basic operations:

 ● cons, snoc — prepend/append element
 ● concat — join two trees
 ● split — find prefix, element and suffix using predicate

Beyond the scope of this presentation, sorry
Finger Tree Performance

Amortized bounds:

              Finger Tree          2-3 Tree   List
 ● cons, snoc O(1)                 O(log n)   O(1)/O(n)
 ● head, last O(1)                 O(log n)   O(1)/O(n)
 ● concat     O(log min(ℓ1, ℓ2))   O(log n)   O(n)
 ● split      O(log min(n, ℓ-n))   O(log n)   O(n)
 ● index      O(log min(n, ℓ-n))   O(log n)   O(n)
Thanks!

Questions?

More Related Content

Viewers also liked

Introduction of data structure
Introduction of data structureIntroduction of data structure
Introduction of data structureeShikshak
 
Introduction to data structures and Algorithm
Introduction to data structures and AlgorithmIntroduction to data structures and Algorithm
Introduction to data structures and AlgorithmDhaval Kaneria
 
Lecture 1 data structures and algorithms
Lecture 1 data structures and algorithmsLecture 1 data structures and algorithms
Lecture 1 data structures and algorithmsAakash deep Singhal
 
Data structures (introduction)
 Data structures (introduction) Data structures (introduction)
Data structures (introduction)Arvind Devaraj
 
DATA STRUCTURES
DATA STRUCTURESDATA STRUCTURES
DATA STRUCTURESbca2010
 

Viewers also liked (7)

Data Structures for Robotic Learning
Data Structures for Robotic LearningData Structures for Robotic Learning
Data Structures for Robotic Learning
 
Introduction of data structure
Introduction of data structureIntroduction of data structure
Introduction of data structure
 
Introduction to data structures and Algorithm
Introduction to data structures and AlgorithmIntroduction to data structures and Algorithm
Introduction to data structures and Algorithm
 
Data Structure
Data StructureData Structure
Data Structure
 
Lecture 1 data structures and algorithms
Lecture 1 data structures and algorithmsLecture 1 data structures and algorithms
Lecture 1 data structures and algorithms
 
Data structures (introduction)
 Data structures (introduction) Data structures (introduction)
Data structures (introduction)
 
DATA STRUCTURES
DATA STRUCTURESDATA STRUCTURES
DATA STRUCTURES
 

Similar to Immutable Data Structures Simply Explained with Java Examples

Programming picaresque
Programming picaresqueProgramming picaresque
Programming picaresqueBret McGuire
 
If You Think You Can Stay Away from Functional Programming, You Are Wrong
If You Think You Can Stay Away from Functional Programming, You Are WrongIf You Think You Can Stay Away from Functional Programming, You Are Wrong
If You Think You Can Stay Away from Functional Programming, You Are WrongMario Fusco
 
ParaSail
ParaSail  ParaSail
ParaSail AdaCore
 
Why we cannot ignore Functional Programming
Why we cannot ignore Functional ProgrammingWhy we cannot ignore Functional Programming
Why we cannot ignore Functional ProgrammingMario Fusco
 
BCS SPA 2010 - An Introduction to Scala for Java Developers
BCS SPA 2010 - An Introduction to Scala for Java DevelopersBCS SPA 2010 - An Introduction to Scala for Java Developers
BCS SPA 2010 - An Introduction to Scala for Java DevelopersMiles Sabin
 
An Introduction to Scala for Java Developers
An Introduction to Scala for Java DevelopersAn Introduction to Scala for Java Developers
An Introduction to Scala for Java DevelopersMiles Sabin
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Martin Odersky
 
A Brief Introduction to Scala for Java Developers
A Brief Introduction to Scala for Java DevelopersA Brief Introduction to Scala for Java Developers
A Brief Introduction to Scala for Java DevelopersMiles Sabin
 
GSoC2014 - Uniritter Presentation May, 2015
GSoC2014 - Uniritter Presentation May, 2015GSoC2014 - Uniritter Presentation May, 2015
GSoC2014 - Uniritter Presentation May, 2015Fabrízio Mello
 
Introductiontoprogramminginscala
IntroductiontoprogramminginscalaIntroductiontoprogramminginscala
IntroductiontoprogramminginscalaAmuhinda Hungai
 
Java.util.concurrent.concurrent hashmap
Java.util.concurrent.concurrent hashmapJava.util.concurrent.concurrent hashmap
Java.util.concurrent.concurrent hashmapSrinivasan Raghvan
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docxCoding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docxmary772
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 

Similar to Immutable Data Structures Simply Explained with Java Examples (20)

Programming picaresque
Programming picaresqueProgramming picaresque
Programming picaresque
 
If You Think You Can Stay Away from Functional Programming, You Are Wrong
If You Think You Can Stay Away from Functional Programming, You Are WrongIf You Think You Can Stay Away from Functional Programming, You Are Wrong
If You Think You Can Stay Away from Functional Programming, You Are Wrong
 
ParaSail
ParaSail  ParaSail
ParaSail
 
Lockless
LocklessLockless
Lockless
 
Yes scala can!
Yes scala can!Yes scala can!
Yes scala can!
 
Why we cannot ignore Functional Programming
Why we cannot ignore Functional ProgrammingWhy we cannot ignore Functional Programming
Why we cannot ignore Functional Programming
 
BCS SPA 2010 - An Introduction to Scala for Java Developers
BCS SPA 2010 - An Introduction to Scala for Java DevelopersBCS SPA 2010 - An Introduction to Scala for Java Developers
BCS SPA 2010 - An Introduction to Scala for Java Developers
 
An Introduction to Scala for Java Developers
An Introduction to Scala for Java DevelopersAn Introduction to Scala for Java Developers
An Introduction to Scala for Java Developers
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009
 
A Brief Introduction to Scala for Java Developers
A Brief Introduction to Scala for Java DevelopersA Brief Introduction to Scala for Java Developers
A Brief Introduction to Scala for Java Developers
 
GSoC2014 - Uniritter Presentation May, 2015
GSoC2014 - Uniritter Presentation May, 2015GSoC2014 - Uniritter Presentation May, 2015
GSoC2014 - Uniritter Presentation May, 2015
 
Introductiontoprogramminginscala
IntroductiontoprogramminginscalaIntroductiontoprogramminginscala
Introductiontoprogramminginscala
 
Java.util.concurrent.concurrent hashmap
Java.util.concurrent.concurrent hashmapJava.util.concurrent.concurrent hashmap
Java.util.concurrent.concurrent hashmap
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Java best practices
Java best practicesJava best practices
Java best practices
 
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docxCoding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 

Immutable Data Structures Simply Explained with Java Examples

  • 1. Persistent Data Structures Living in a world where nothing changes but everything evolves - or - A complete idiot's guide to immutability
  • 2. Java vs Haskell
       Java: ● Warm, soft and cute ● Imperative ● Object oriented ● Just like good old Basic, but with classes
       Haskell: ● Strange, unfamiliar alien ● Purely functional ● Everything is different ● Shocking news! It's not like Basic!
  • 3. Haskell does not have variables! Imagine a dialect of Java where everything is final by default class LinkedList { class Node { final Node next, prev; final Object value; } final Node head, tail; void add(final Object v) { for (final Node n = head; n != null; n = n.next) { ... } } } All fields, parameters and variables are automatically immutable, the final is implied everywhere, and there is no way to get rid of it
  • 4. Haskell does not have variables! Imagine a dialect of Java where everything is final by default class LinkedList { class Node { final Node next, prev; final Object value; } final Node head, tail; void add(final Object v) { for (final Node n = head; n != null; n = n.next) { ... } } } All fields, parameters and variables are automatically immutable, the final is implied everywhere Speech bubbles: "It does for me!", "But it doesn't make sense!", "It won't work!"
  • 5. What is a variable? var·y/ˈve(ə)rē/ vary, varied, varying ● — verb (used with object) Definition: to change or alter, as in form, appearance, character, or substance ● — verb (used without object) Definition: to undergo change in appearance, form, substance, character, etc ● — synonyms: modify, mutate
  • 6. "Variables" in Haskell ● Must be assigned when declared YES: int a = 1; NO: int a; ● Cannot be reassigned YES: final int a = 1; NO: a = 2; These are mathematical variables, not imperative ones!
  • 7. When everything is immutable There is no notion of time: ● Functions take old values, produce new values, nothing is changed in-place ● It does not matter when a function was called, it only matters what arguments it was called with There is no notion of identity: ● Everything is a value, complex data structures are values too ● There is no way to tell if a == b, only if a.equals(b) ● In other words, values are never identical to each other, but may be equal
  • 8. I want my linked list! Basic terminology: ● Ephemeral data structure — everything that is not persistent. Most Java data structures (lists, sets, etc.) are ephemeral. ● Persistent data structure — immutable data structure with history. No in-place modifications. Operations on it create new versions. Older versions are always available. That. Is. Simple. ● The persistence property has nothing to do with persistent storage, like disks! This is a completely different story.
  • 9. I want my linked list! ● In imperative languages, like Java, most data structures are ephemeral by default. Designing persistent data structures is somewhat awkward and not always efficient ● In purely functional languages, like Haskell, all data structures are automatically persistent! There is just no other way to make data structures
  • 10. History of updates Making an update to a persistent DS instance always creates a new instance that contains this update. The current version is left unmodified.
  • 11. Why should I bother? Is it fun? Hell yeah! But is it practical? Let's see!
  • 12. The free lunch is over! "The biggest sea change in software development since the OO revolution is knocking at the door, and its name is Concurrency." — Herb Sutter A commodity hardware (my laptop) The need for writing correct multi-threaded code is constantly increasing
  • 13. Concurrent data structures are hard! Want a concurrent ephemeral linked list? Here are some implementation strategies: ● Coarse-grained synchronization ● Fine-grained synchronization ● Optimistic synchronization ● Lazy synchronization All lock-based: no composition, deadlocks, etc. ● Non-blocking synchronization in different flavors And if you need the size of a list, you are in trouble!
  • 14. Concurrent data structures are hard! ● Making mutable concurrent data structures requires inter-thread coordination within these structures ● Locks and atomic references all over the place ● Decades of research by academia with many attempts ● Sophisticated algorithms that are hard to reason about, test and prove ● Several different ways to solve the same problems, each with its own pros and cons
  • 15. Concurrent data structures are hard! ● Making mutable concurrent data structures requires inter-thread coordination within these structures ● Locks and atomic references all over the place ● Decades of research by academia with many attempts ● Sophisticated algorithms that are hard to test and prove ● Several different ways to solve the same problems, each with its own pros and cons Speech bubble: "Yes, but are persistent data structures actually simpler?"
  • 16. Just give up mutability! ● Persistent data structures are easy to reason about in a concurrent environment ● The behavior does not depend on how many threads are trying to "modify" it at once ● Therefore persistent data structures are very easy to test and debug
  • 17. The whole picture ● Persistent data structures alone are not sufficient: they are an essential part of the picture, but not the whole answer to concurrency ● Inter-thread coordination is needed: threads still need to know what the other threads are doing to agree on a common outcome ● But it can be added "outside", which gives us complete separation of concerns
  • 18. The whole picture Solving the concurrency challenge in a modern language: ● Scala Way: persistent data structures with message passing ● Clojure Way: persistent data structures with software transactional memory ● Will likely be mixed in the future
  • 19. Last few words on concurrency ● Persistent data structures are slower than ephemeral ones in sequential use ● But not that much slower! ● We can forgive it, since they give you more functionality, and ephemeral data structures are simply less capable ● And in the multiprocessor era, it is better to make things scalable rather than fast
  • 20. Efficient persistent data structures We want persistent data structures to be space and time efficient: ● Structural sharing We want to reuse as many fragments of the previous version as possible ● Path copying We want to copy as few pieces as possible ● Maybe, just maybe lazy evaluation (where available) We don't want nasty pathological cases
  • 21. A case study ● Let's make some persistent data structures in Java ● All these structures consist of classes with only final fields ● With good amortized asymptotic complexity in most cases Speech bubble: "Why are you looking at me?!"
  • 22. Our plan Let's start with some trivial examples ● Stack ● Queue ● Tree Then proceed with more advanced structures ● Hash Table ● Finger Tree
  • 23. Trivial Example: Persistent Stack It's just a singly linked list of nodes class Stack<T> { final T v; (a) final Stack<T> next; (b) Stack() { v = null; next = null; } Stack(T v, Stack<T> next) { this.v = v; this.next = next; } ... Source Code 1/2
  • 24. Trivial Example — Persistent Stack class Stack<T> { ... Stack<T> push(T v) { return new Stack<T>(v, this); (a) } T peek() { if (next == null) throw new NoSuchElementException(); return v; (b) } Stack<T> pop() { if (next == null) throw new NoSuchElementException(); return next; (c) } Source Code 2/2
  • 25. Trivial Example — Persistent Stack Structural sharing in persistent stack
  • 26. Trivial Example — Persistent Stack Looks familiar? The versions tree!
  • 27. Trivial Example — Persistent Stack Also known as Spaghetti stack or Cactus stack
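The structural sharing and version tree described on the stack slides can be sketched as a small runnable program. This is a minimal sketch: the `isEmpty()` helper and the `PersistentStackDemo` class are assumptions added for illustration, since the slides elide the rest of the class.

```java
// Persistent stack as on the slides; the empty stack is a node whose next is null.
final class Stack<T> {
    final T v;
    final Stack<T> next;

    Stack() { this.v = null; this.next = null; }            // empty stack
    private Stack(T v, Stack<T> next) { this.v = v; this.next = next; }

    boolean isEmpty() { return next == null; }               // assumed helper

    // O(1): the new version points at the old one, nothing is copied.
    Stack<T> push(T v) { return new Stack<T>(v, this); }

    T peek() {
        if (isEmpty()) throw new java.util.NoSuchElementException();
        return v;
    }

    // O(1): the previous version is simply returned.
    Stack<T> pop() {
        if (isEmpty()) throw new java.util.NoSuchElementException();
        return next;
    }
}

// Hypothetical demo class: two versions branch from the same older version.
class PersistentStackDemo {
    public static void main(String[] args) {
        Stack<String> s1 = new Stack<String>().push("a").push("b");
        Stack<String> s2 = s1.push("c");             // one branch of the version tree
        Stack<String> s3 = s1.push("d");             // another branch, same parent
        System.out.println(s1.peek());               // b (s1 is unchanged)
        System.out.println(s2.pop() == s3.pop());    // true: both share s1
    }
}
```

Because every version is immutable, `s2` and `s3` can share all of `s1` safely, which is what makes the stack behave like a cactus stack.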
  • 28. Persistent Queue It's just two stacks combined: ● Back stack to enqueue items ● Front stack to dequeue items When the front stack is empty, reverse the back stack and use it as the front stack
  • 29. Persistent Queue class Queue<T> { // back stack - push elements here final Stack<T> b; (a) // front stack - pop elements from here final Stack<T> f; (b) Queue() { b = f = new Stack<T>(); } Queue(Stack<T> b, Stack<T> f) { this.b = b; this.f = f; } boolean isEmpty() { return f.isEmpty(); (c) } ... Source Code 1/3
  • 30. Persistent Queue class Queue<T> { ... static <T> Queue<T> check(Stack<T> b, Stack<T> f) { if (f.isEmpty()) return new Queue<T>(f, b.reverse()); (a) else return new Queue<T>(b, f); (b) } Queue<T> push(T v) { return check(b.push(v), f); } Queue<T> pop() { if (isEmpty()) { throw new NoSuchElementException(); } return check(b, f.pop()); } Source Code 2/3
  • 31. Persistent Queue class Queue<T> { ... T peek() { if (isEmpty()) { throw new NoSuchElementException(); } return f.peek(); } class Stack<T> { ... Stack<T> reverse() { if (isEmpty() || next.isEmpty()) return this; Stack<T> r = new Stack<T>(); for (Stack<T> s = this; !s.isEmpty(); s = s.pop()) { r = r.push(s.peek()); } return r; } Source Code 3/3
  • 32. Persistent Queue Structural sharing in persistent queue
  • 33. Persistent Queue Beware pathological cases! ● What if the front stack is about to become empty, but the back stack is full? ● And we are going to pop from the same queue N times ● Then we get N back stack reversals! ● Lazy evaluation to the rescue: use lazy streams instead of strict stacks
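The pathological case can be made visible with a small experiment. The sketch below restates the two-stack queue from the slides and adds a static `reversals` counter, which is purely hypothetical instrumentation, to show that popping the same queue version repeatedly repeats the full reversal each time.

```java
final class Stack<T> {
    static int reversals = 0;   // hypothetical instrumentation, not in the slides
    final T v;
    final Stack<T> next;

    Stack() { this.v = null; this.next = null; }
    Stack(T v, Stack<T> next) { this.v = v; this.next = next; }

    boolean isEmpty() { return next == null; }
    Stack<T> push(T v) { return new Stack<T>(v, this); }
    T peek() {
        if (isEmpty()) throw new java.util.NoSuchElementException();
        return v;
    }
    Stack<T> pop() {
        if (isEmpty()) throw new java.util.NoSuchElementException();
        return next;
    }
    Stack<T> reverse() {
        if (isEmpty() || next.isEmpty()) return this;
        reversals++;            // count only real O(n) reversals
        Stack<T> r = new Stack<T>();
        for (Stack<T> s = this; !s.isEmpty(); s = s.pop()) r = r.push(s.peek());
        return r;
    }
}

final class Queue<T> {
    final Stack<T> b;   // back stack: push here
    final Stack<T> f;   // front stack: pop from here

    Queue() { this.b = new Stack<T>(); this.f = new Stack<T>(); }
    Queue(Stack<T> b, Stack<T> f) { this.b = b; this.f = f; }

    boolean isEmpty() { return f.isEmpty(); }

    // Keep the invariant: the front is empty only if the whole queue is empty.
    static <T> Queue<T> check(Stack<T> b, Stack<T> f) {
        return f.isEmpty() ? new Queue<T>(f, b.reverse()) : new Queue<T>(b, f);
    }
    Queue<T> push(T v) { return check(b.push(v), f); }
    Queue<T> pop() {
        if (isEmpty()) throw new java.util.NoSuchElementException();
        return check(b, f.pop());
    }
    T peek() {
        if (isEmpty()) throw new java.util.NoSuchElementException();
        return f.peek();
    }
}

class QueueReversalDemo {
    public static void main(String[] args) {
        // front = [1], back = [4, 3, 2]
        Queue<Integer> q = new Queue<Integer>().push(1).push(2).push(3).push(4);
        Stack.reversals = 0;
        for (int i = 0; i < 3; i++) q.pop();   // same version popped three times
        System.out.println(Stack.reversals);   // 3: the back stack is reversed every time
    }
}
```

In ephemeral use each element is reversed at most once, so the cost amortizes; with persistence an expensive version can be revisited, which is why the slides point to lazy evaluation.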
  • 34. Persistent Queue But there is a better way to design queue! Monoidally Annotated 2-3 Finger Tree is a versatile data structure that can be used to build efficient lists, deques, priority queues, interval trees, ropes, etc. It is more complex, we will take a look at it later.
  • 35. Persistent Tree ● It is trivial to convert any ephemeral tree to a persistent one by means of path copying ● It works for binary trees, 2-3 trees, B-trees, etc ● The shape of tree is not affected, only mutating algorithms ● In a balanced binary tree at most log N nodes need to be copied — quite efficient ● The secret to all persistent data structures is that they all are trees! (Yes, lists and hash tables are trees too)
  • 37. Simple Persistent Binary Tree class SimpleBinaryTree { static class Node<K, V> { final K key; (a) final V value; (b) final Node<K, V> l, r; (c) Node(K key, V value, Node<K, V> l, Node<K, V> r) { this.key = key; this.value = value; this.l = l; this.r = r; } } ... Source Code 1/2
  • 38. Simple Persistent Binary Tree class SimpleBinaryTree { ... static <K extends Comparable<K>, V> Node<K, V> insert(Node<K, V> n, K key, V value) { if (n == null) { return new Node<K, V>(key, value, null, null); (a) } int cmp = key.compareTo(n.key); (b) if (cmp < 0) { return new Node<K, V>(n.key, n.value, (c) insert(n.l, key, value), n.r); } if (cmp > 0) { return new Node<K, V>(n.key, n.value, (d) n.l, insert(n.r, key, value)); } return new Node<K, V>(key, value, n.l, n.r); (e) } Source Code 2/2
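Path copying in the tree above can be checked directly by comparing object identities between two versions. The following is a sketch under stated assumptions: the `get` method and the `PathCopyDemo` class are additions for illustration, and the type parameters are spelled out so that the code compiles.

```java
final class SimpleBinaryTree {
    static final class Node<K, V> {
        final K key;
        final V value;
        final Node<K, V> l, r;
        Node(K key, V value, Node<K, V> l, Node<K, V> r) {
            this.key = key; this.value = value; this.l = l; this.r = r;
        }
    }

    // Copies only the nodes on the path from the root to the insertion point.
    static <K extends Comparable<K>, V> Node<K, V> insert(Node<K, V> n, K key, V value) {
        if (n == null) return new Node<K, V>(key, value, null, null);
        int cmp = key.compareTo(n.key);
        if (cmp < 0) return new Node<K, V>(n.key, n.value, insert(n.l, key, value), n.r);
        if (cmp > 0) return new Node<K, V>(n.key, n.value, n.l, insert(n.r, key, value));
        return new Node<K, V>(key, value, n.l, n.r);
    }

    // Assumed helper: plain binary search, returns null when the key is absent.
    static <K extends Comparable<K>, V> V get(Node<K, V> n, K key) {
        while (n != null) {
            int cmp = key.compareTo(n.key);
            if (cmp == 0) return n.value;
            n = cmp < 0 ? n.l : n.r;
        }
        return null;
    }
}

class PathCopyDemo {
    public static void main(String[] args) {
        SimpleBinaryTree.Node<Integer, String> v1 = SimpleBinaryTree.insert(null, 5, "five");
        v1 = SimpleBinaryTree.insert(v1, 3, "three");
        v1 = SimpleBinaryTree.insert(v1, 8, "eight");
        SimpleBinaryTree.Node<Integer, String> v2 = SimpleBinaryTree.insert(v1, 9, "nine");

        System.out.println(v2.l == v1.l);                  // true: left subtree is shared
        System.out.println(v2.r == v1.r);                  // false: right path was copied
        System.out.println(SimpleBinaryTree.get(v1, 9));   // null: old version is intact
        System.out.println(SimpleBinaryTree.get(v2, 9));   // nine
    }
}
```

Note that this tree is not self-balancing, so the logarithmic bound on copied nodes holds only while the tree stays balanced.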
  • 39. Persistent Tree Multiple definitions of persistence: ● Immutable data structure with history ● Committed to a persistent storage Append only databases and file systems: ● CouchDB uses append only B-Tree ● RethinkDB makes append only variant of MySQL ● ZFS, BTRFS implement copy-on-write transactions and snapshots Nothing is new under the moon!
  • 40. Persistent Map interface Map<K, V> { // get value for a key, or null if not found V get(K key); // make key/value association Map<K, V> put(K key, V value); // remove key/value association Map<K, V> remove(K key); } Remember, no in-place updates Mutations create new instances
  • 41. Persistent Map Implementation Strategy ● Persistent red-black tree for ordered keys Time complexity — O(log n) ● Persistent hash table for hashable keys Time complexity — O(1)
  • 42. Persistent Hash Table But how do we implement it? Copying the whole table would be too expensive!
  • 43. Persistent Hash Table Here's the idea: partition the hash table into smaller pieces, organize them as a persistent tree Nice idea, but how do we navigate in such a tree?
  • 44. Prefix Tree/Trie Search is guided by individual letters of a string key Hash code is just a string of digits!
  • 45. Persistent Hash Table in Prefix Tree Represent 32-bit hash codes as strings of 5-bit symbols: hashCode = 0xCAFEBABE
       level    6   5      4      3      2      1      0
       bits     11  00101  01111  11101  01110  10101  11110
       symbol   3   5      15     29     14     21     30
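The decomposition of a hash code into 5-bit symbols is a shift and a mask. A minimal sketch follows; the class name `HashSymbols` and the method name `symbol` are hypothetical, and the shift expression is the same one used by `find` on a later slide.

```java
final class HashSymbols {
    static final int BITS = 5;                  // bits per trie level
    static final int MASK = (1 << BITS) - 1;    // 31

    // Symbol (child index 0..31) of the hash code at the given level; level 0 is the lowest 5 bits.
    static int symbol(int hashCode, int level) {
        return (hashCode >>> (level * BITS)) & MASK;
    }

    public static void main(String[] args) {
        int h = 0xCAFEBABE;
        StringBuilder sb = new StringBuilder();
        for (int level = 6; level >= 0; level--) {
            sb.append(symbol(h, level));
            if (level > 0) sb.append(' ');
        }
        System.out.println(sb);   // 3 5 15 29 14 21 30
    }
}
```

The unsigned shift `>>>` matters here: with `>>` a negative hash code such as this one would be sign-extended, although the mask would still hide the difference for levels 0 to 5.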
  • 46. Persistent Hash Table hashCode = ... xxxxx xxxxx xxxxx xxxxx Each item is either a key/value pair or a subtree
  • 47. Persistent Hash Table class PersistentHashMap { abstract class Item<K, V> {} class Node<K, V> extends Item<K, V> { final Item<K, V>[] children = (Item<K, V>[]) new Item[32]; (a) } class Entry<K, V> extends Item<K, V> { final int hashCode; (b) final K key; (c) final V value; (d) final Entry<K, V> next; (e) } Source Code 1/2
  • 48. Persistent Hash Table class PersistentHashMap { V get(K key) { return root.find(key.hashCode(), key, 0); (a) } class Node<K, V> extends Item<K, V> { V find(int hashCode, K key, int level) { int index = (hashCode >>> (level * 5)) & 31; (b) Item<K, V> item = children[index]; (c) if (item instanceof Node) { (d) return ((Node<K, V>) item) (e) .find(hashCode, key, level + 1); } if (item instanceof Entry) { (f) return ((Entry<K, V>) item) (g) .find(hashCode, key); } return null; } Source Code 2/2
  • 49. Persistent Hash Table Do not waste space! class PersistentHashMap { class Node<K, V> { final Item<K, V>[] children = (Item<K, V>[]) new Item[32]; (a) } ● Most of the children would be null on deeper levels ● The number of arrays grows exponentially as we go deeper ● Need to find a way to compact the tree ● Simply get rid of nulls in arrays!
  • 50. Persistent Hash Table class Node<K, V> { final int mask; (a) final Item<K, V>[] children = (Item<K, V>[]) new Item[bitCount(mask)]; (b) } ● Mask is a 32-bit integer whose bits are set to 1 only for those array elements that are not null ● Array stores only non-null elements. Its size is the number of 1 bits in the mask. Array size varies from 2 to 32 elements. ● Overhead for a null array element is just one bit. Quite good!
  • 51. Persistent Hash Table ● To test that array has element at index i, simply test if ith bit in the mask is 1: if ((mask & (1 << i)) != 0) { ... ● To get offset to ith element in the array, count number of 1 bits lower than i in the mask: int offset = bitCount(mask & ((1 << i) - 1)); if (children[offset] instanceof ...
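The mask test and the offset computation can be combined into a small compacted node. This is a sketch, not the deck's implementation: `BitmapNode`, `has`, `offset`, `get` and `with` are hypothetical names, children are typed as `Object` to keep the example short, and `Integer.bitCount` from the standard library stands in for the `bitCount` used on the slides.

```java
final class BitmapNode {
    final int mask;            // bit i is 1 when logical slot i is occupied
    final Object[] children;   // only the occupied slots, in slot order

    BitmapNode(int mask, Object[] children) {
        this.mask = mask;
        this.children = children;
    }

    static BitmapNode empty() { return new BitmapNode(0, new Object[0]); }

    boolean has(int i) { return (mask & (1 << i)) != 0; }

    // Number of occupied slots below i, which is the position of slot i in the array.
    int offset(int i) { return Integer.bitCount(mask & ((1 << i) - 1)); }

    Object get(int i) { return has(i) ? children[offset(i)] : null; }

    // Persistent update: returns a new node and leaves this one untouched.
    BitmapNode with(int i, Object child) {
        int off = offset(i);
        int skip = has(i) ? 1 : 0;   // replacing an existing child or inserting a new one
        Object[] copy = new Object[children.length + 1 - skip];
        System.arraycopy(children, 0, copy, 0, off);
        copy[off] = child;
        System.arraycopy(children, off + skip, copy, off + 1, children.length - off - skip);
        return new BitmapNode(mask | (1 << i), copy);
    }
}

class BitmapNodeDemo {
    public static void main(String[] args) {
        BitmapNode n = BitmapNode.empty().with(5, "a").with(29, "b").with(14, "c");
        System.out.println(n.children.length);   // 3 slots stored instead of 32
        System.out.println(n.offset(14));        // 1: only slot 5 is below slot 14
        System.out.println(n.get(14));           // c
        System.out.println(n.get(3));            // null
    }
}
```

Copying a node costs at most 32 references, so an update still touches only the few small arrays along one root-to-entry path.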
  • 52. Persistent List interface Seq<T> { T head(); // get first element Seq<T> tail(); // get list without first element Seq<T> cons(T v); // prepend element (add to head) Seq<T> snoc(T v); // append element (add to tail) Seq<T> concat(Seq<T> that); // join two lists int size(); // get number of elements T get(int index); // get Nth element Seq<T> set(int index, T v); // set Nth element } Remember, no in-place updates Mutations create new instances
  • 53. Persistent List ● There are quite a few ways to implement persistent lists ● But we will not be studying them ● Instead, we will turn our attention to finger trees ● Soon, it will be clear why
  • 54. Finger Trees ● An incredibly elegant, simple and efficient data structure ● Oh so very versatile, functional programmer's Swiss Army knife ● Basic data structure for building random access sequences, deques, priority queues, ropes, interval trees, etc. ● Let's define it in stages
  • 55. Persistent leafy 2-3 trees Let's begin with a simple data structure: leafy 2-3 tree ● Every intermediate node has either two children or three children ● All values are stored in leaves ● Perfectly balanced: all leaves are at the same level
  • 57. Persistent leafy 2-3 trees Leaves contain interesting values, but what is stored in nodes?
  • 58. Annotated leafy 2-3 trees ● There must be a way to find interesting values in a tree ● We need to guide search from the root of a tree to its leaves ● Let's add special annotations to nodes ● Use these annotations to find values
  • 59. Size annotated leafy 2-3 trees ● Each intermediate node is annotated with the size of a subtree rooted at this node ● Makes it trivial to find any leaf by its index ● Starting from root, test if index is in the range of its left (middle) or right subtree, and repeat recursively for that subtree, until a leaf is found
  • 60. Size annotated leafy 2-3 trees Looks like random access list
  • 61. Priority annotated leafy 2-3 trees ● Each intermediate node is annotated with the highest priority of an element in its subtree ● Makes it trivial to find the value with the highest priority ● Starting from the root, find the subtree with the highest priority and descend recursively into it, until a leaf is found
  • 62. Priority annotated leafy 2-3 trees Looks like priority queue
  • 63. Monoids ● One interface to unify size, priority (and more!) annotations on trees ● A set of values with a "zero" element 0 and a binary associative operation ⊕ ● Monoid laws: 0⊕a = a a⊕0 = a a⊕(b⊕c) = (a⊕b)⊕c
  • 64. Monoid examples ● Strings with empty string and concatenation "" + "a" = "a", "a" + "" = "a" "a" + ("b" + "c") = ("a" + "b") + "c" ● Integers with zero and addition 0 + 1 = 1, 1 + 0 = 1 1 + (2 + 3) = (1 + 2) + 3 ● Integers with one and multiplication 1 * 2 = 2, 2 * 1 = 2 2 * (3 * 4) = (2 * 3) * 4 ● And many more of them! (Monoids are everywhere)
  • 65. Monoid interface interface Monoid<T extends Monoid<T>> { T unit(); T combine(T that); } class String implements Monoid<String> { ... String unit() { return ""; (a) } String combine(String that) { return this + that; (b) } }
  • 66. Size monoid class Size implements Monoid<Size> { final int size; (a) Size(int size) { this.size = size; } Size unit() { return new Size(0); (b) } Size combine(Size that) { return new Size(this.size + that.size); (c) } }
  • 67. Priority monoid class Priority implements Monoid<Priority> { final int priority; (a) Priority(int priority) { this.priority = priority; } Priority unit() { return new Priority(Integer.MAX_VALUE); (b) } Priority combine(Priority that) { return new Priority( Math.min(this.priority, that.priority)); (c) } }
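The monoid interface and the two instances above can be exercised with a generic fold. This is a minimal sketch: the `fold` helper and the `MonoidDemo` class are assumptions added for illustration, and the methods are declared `public` because Java requires that when implementing an interface.

```java
interface Monoid<T extends Monoid<T>> {
    T unit();
    T combine(T that);
}

final class Size implements Monoid<Size> {
    final int size;
    Size(int size) { this.size = size; }
    public Size unit() { return new Size(0); }
    public Size combine(Size that) { return new Size(this.size + that.size); }
}

final class Priority implements Monoid<Priority> {
    final int priority;   // smaller number means higher priority
    Priority(int priority) { this.priority = priority; }
    public Priority unit() { return new Priority(Integer.MAX_VALUE); }
    public Priority combine(Priority that) {
        return new Priority(Math.min(this.priority, that.priority));
    }
}

final class MonoidDemo {
    // Hypothetical helper: combines all values left to right, starting from the unit.
    static <M extends Monoid<M>> M fold(M unit, java.util.List<M> values) {
        M acc = unit;
        for (M m : values) acc = acc.combine(m);
        return acc;
    }

    public static void main(String[] args) {
        Size total = fold(new Size(0),
                java.util.Arrays.asList(new Size(1), new Size(2), new Size(3)));
        Priority best = fold(new Priority(Integer.MAX_VALUE),
                java.util.Arrays.asList(new Priority(7), new Priority(2), new Priority(9)));
        System.out.println(total.size);      // 6
        System.out.println(best.priority);   // 2
    }
}
```

The same `fold` works for both annotations, which is the point of the abstraction: a tree can cache `combine` results in its nodes without knowing what they mean.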
  • 68. But where do we get monoids from? ● Monoids have nice property of composability ● We can get more monoids by combining existing ones ● But where do we get initial monoids to begin with? ● We need a way to measure values! ● Those measures must be monoids, obviously interface Measured<M extends Monoid> { M measure(); }
  • 69. Let's make a sketch of an annotated tree /** <V> is the type of values, <M> is the type of monoidal measures of values */ class Tree<M extends Monoid<M>, V extends Measured<M>> implements Measured<M> { abstract class Leaf<M, V> extends Tree<M, V> { final V value; override abstract M measure(); } class Node<M, V> extends Tree<M, V> { final Tree<M, V> left, right; final M m; Node(Tree<M, V> l, Tree<M, V> r) { left = l; right = r; m = l.measure().combine(r.measure()); } override final M measure() { return m; } } } Pseudocode!
  • 70. Let's make a sketch of an annotated tree ... class Leaf<V> extends Tree<Size, V> { final V value; override final Size measure() { return new Size(1); } } ... class Leaf<V> extends Tree<Priority, V> { final V value; override final Priority measure() { return new Priority(value.priority()); } } Pseudocode!
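For readers who want something that compiles, here is one possible runnable take on the pseudocode above, restricted to the `Size` measure. It is a sketch under assumptions: the tree is binary, leaves are not required to be `Measured` themselves, and the `get` method (index lookup driven by size annotations) is an addition for illustration.

```java
class AnnotatedTreeSketch {
    interface Monoid<T extends Monoid<T>> { T unit(); T combine(T that); }
    interface Measured<M extends Monoid<M>> { M measure(); }

    static final class Size implements Monoid<Size> {
        final int size;
        Size(int size) { this.size = size; }
        public Size unit() { return new Size(0); }
        public Size combine(Size that) { return new Size(size + that.size); }
    }

    static abstract class Tree<M extends Monoid<M>, V> implements Measured<M> { }

    // Every leaf counts as exactly one element.
    static final class Leaf<V> extends Tree<Size, V> {
        final V value;
        Leaf(V value) { this.value = value; }
        public Size measure() { return new Size(1); }
    }

    // An inner node caches the combined measure of its two children.
    static final class Node<M extends Monoid<M>, V> extends Tree<M, V> {
        final Tree<M, V> left, right;
        final M m;
        Node(Tree<M, V> l, Tree<M, V> r) {
            left = l;
            right = r;
            m = l.measure().combine(r.measure());
        }
        public M measure() { return m; }
    }

    // Index lookup: the size of the left subtree tells us which way to go.
    static <V> V get(Tree<Size, V> t, int i) {
        if (i < 0 || i >= t.measure().size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        if (t instanceof Leaf) {
            return ((Leaf<V>) t).value;
        }
        Node<Size, V> n = (Node<Size, V>) t;
        int leftSize = n.left.measure().size;
        return i < leftSize ? get(n.left, i) : get(n.right, i - leftSize);
    }

    public static void main(String[] args) {
        Tree<Size, String> ab = new Node<>(new Leaf<>("a"), new Leaf<>("b"));
        Tree<Size, String> abc = new Node<>(ab, new Leaf<>("c"));
        System.out.println(abc.measure().size); // 3
        System.out.println(get(abc, 2));        // c
    }
}
```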
  • 71. But that is not a finger tree yet!
  • 72. Finger Tree ... is just an annotated tree of annotated 2-3 trees!
  • 73. Finger Tree Digits, 2-3 trees, fingers and nested levels
  • 74. Finger Tree A little bit of Haskell would not hurt: data Node v a = Node2 v a a | Node3 v a a a data Digit v a = One v a | Two v a a | Three v a a a | Four v a a a a data FingerTree v a = Empty | Single a | Deep v (Digit v a) (FingerTree v (Node v a)) (Digit v a)
  • 75. Finger Tree class FingerTree<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { class Empty<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> {} class Single<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> { final T v; final M m; } class Deep<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> { final Digit<M, T> prefix; final FingerTree<M, Node<M, T>> middle; final Digit<M, T> suffix; final M m; } } Source Code 1/3
  • 76. Finger Tree class Digit<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { final M m; class One<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a; } class Two<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b; } class Three<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b, c; } class Four<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b, c, d; } } Source Code 2/3
  • 77. Finger Tree class Node<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { final M m; class Node2<M extends Monoid<M>, T extends Measured<M>> extends Node<M, T> { final T a, b; } class Node3<M extends Monoid<M>, T extends Measured<M>> extends Node<M, T> { final T a, b, c; } } Source Code 3/3
  • 78. Finger Tree Interface Basic operations: ● cons, snoc: prepend/append an element ● concat: join two trees ● split: find prefix, element and suffix using a predicate. Beyond the scope of this presentation, sorry
  • 79. Finger Tree Performance Amortized bounds (Finger Tree | 2-3 Tree | List): ● cons, snoc: O(1) | O(log n) | O(1)/O(n) ● head, last: O(1) | O(log n) | O(1)/O(n) ● concat: O(log min(ℓ1, ℓ2)) | O(log n) | O(n) ● split: O(log min(n, ℓ-n)) | O(log n) | O(n) ● index: O(log min(n, ℓ-n)) | O(log n) | O(n)