RethinkDB and Elixir
Peter Hamilton
SF Elixir Meetup
October 15th, 2015
Fault Tolerance + Automatic Failover
Macros + Native Language Queries
Concurrency + Persistent Feeds
= Realtime Applications
RethinkDB Queries
Instead of using a string-based protocol like SQL or MongoDB, RethinkDB uses native language constructs to create the query AST.

Supports native anonymous functions.

The Elixir driver includes a lambda macro that lets you use native operators like >, [], and +.
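To make the "query AST" idea concrete, here is a minimal sketch of how plain function calls can build a query tree without ever assembling a string. The module name QueryAstSketch and the tuple shape {operation, arguments} are assumptions made for illustration; the driver's real term representation is different.

```elixir
defmodule QueryAstSketch do
  # Hypothetical, simplified term builders. Each one returns a plain tuple
  # of {operation, arguments}; this is not the driver's actual format.
  def db(name), do: {:db, [name]}
  def table(name), do: {:table, [name]}
  def table(db, name), do: {:table, [db, name]}
  def filter(sequence, predicate), do: {:filter, [sequence, predicate]}
end

import QueryAstSketch

# The nested call and the piped call are two spellings of the same tree.
nested = filter(table(db("test"), "people"), %{foo: "bar"})
piped = db("test") |> table("people") |> filter(%{foo: "bar"})

IO.inspect(piped)
IO.inspect(nested == piped)
```

Because every builder takes the previous term as its first argument, the pipe operator composes queries naturally, which is why the piped form reads top to bottom.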
Examples
filter(table(db("test"), "people"), %{foo: "bar"}) # hard to read

db("test")
|> table("people")
|> filter(%{foo: "bar"}) # easier to read
table("people")
|> filter(fn(person) ->
  person
  |> bracket("friends")
  |> count()
  |> gt(add(5, person |> bracket("age")))
end) # native function
table("people")
|> filter(lambda fn(person) ->
  count(person["friends"]) > 5 + person["age"]
end) # native function with lambda macro
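One way such a lambda macro could work is by walking the quoted expression and swapping operators for query-building calls. The sketch below is an assumption about the technique, not the driver's implementation: LambdaSketch and its gt, add, bracket and count builders are hypothetical stand-ins that return tagged tuples.

```elixir
defmodule LambdaSketch do
  # Hypothetical term builders standing in for the driver's query functions.
  def gt(a, b), do: {:gt, a, b}
  def add(a, b), do: {:add, a, b}
  def bracket(a, key), do: {:bracket, a, key}
  def count(a), do: {:count, a}

  # Rewrites >, + and [] inside the given expression into the builders above.
  defmacro lambda(fun) do
    Macro.prewalk(fun, fn
      {:>, _, [a, b]} ->
        quote(do: LambdaSketch.gt(unquote(a), unquote(b)))

      {:+, _, [a, b]} ->
        quote(do: LambdaSketch.add(unquote(a), unquote(b)))

      # a[key] is represented in the AST as a call to Access.get(a, key).
      {{:., _, [Access, :get]}, _, [a, key]} ->
        quote(do: LambdaSketch.bracket(unquote(a), unquote(key)))

      other ->
        other
    end)
  end
end

defmodule LambdaSketchDemo do
  import LambdaSketch

  def predicate do
    lambda(fn person -> count(person["friends"]) > 5 + person["age"] end)
  end
end

# Calling the function with a placeholder shows the tree it builds.
IO.inspect(LambdaSketchDemo.predicate().(:person))
```

The rewrite happens at compile time, so the function body never compares or adds real values; it only builds a term describing the comparison.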
Feeds
Types of Change Feeds:

Order By / Limit - Keep track of
the top N entries. Send diff to
client if any changes happen.

Point - Keep track of changes
to a single document. Send new
document if any changes
happen.

Sequence - Keep track of
changes to a collection. Send
diff if any changes happen.
Examples
table("people")
|> order_by(%{index: "karma"})
|> limit(20)
|> changes() # Order By / Limit

table("people")
|> get("5ec49883")
|> changes() # Point

table("people")
|> filter(%{team: "blue"})
|> changes() # Sequence
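On the receiving side, each change arrives as a document with "old_val" and "new_val" keys. A minimal sketch of folding such diffs into a local copy of the collection follows; the ChangeSketch module and the sample data are hypothetical, and it assumes every document carries the default "id" primary key.

```elixir
defmodule ChangeSketch do
  # Applies one change document to a map of documents keyed by "id".
  # Insert: there is no old value.
  def apply_change(docs, %{"old_val" => nil, "new_val" => new}),
    do: Map.put(docs, new["id"], new)

  # Delete: there is no new value.
  def apply_change(docs, %{"old_val" => old, "new_val" => nil}),
    do: Map.delete(docs, old["id"])

  # Update: replace the stored document with the new value.
  def apply_change(docs, %{"new_val" => new}),
    do: Map.put(docs, new["id"], new)
end

# Made-up sample diffs: insert 1, update 1, insert 2, delete 2.
changes = [
  %{"old_val" => nil, "new_val" => %{"id" => 1, "team" => "blue"}},
  %{"old_val" => %{"id" => 1, "team" => "blue"},
    "new_val" => %{"id" => 1, "team" => "blue", "karma" => 3}},
  %{"old_val" => nil, "new_val" => %{"id" => 2, "team" => "blue"}},
  %{"old_val" => %{"id" => 2, "team" => "blue"}, "new_val" => nil}
]

docs = Enum.reduce(changes, %{}, &ChangeSketch.apply_change(&2, &1))
IO.inspect(docs)
```

After the four diffs only document 1 remains, in its updated form.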
Cluster
Each table shard has a primary plus replicas. The primary for each shard is elected via Raft. Re-election occurs automatically if the primary is unresponsive.

Running RethinkDB in proxy mode on the same host as the application is recommended, as it allows efficient query routing (finding the primary server for a query).
[Diagram: an Application Server, a RethinkDB Proxy, and three RethinkDB Servers]
Connection
A connection should be added to the supervision tree. Connections multiplex queries, so multiple connections are not necessary for concurrent access to the database. Connection pooling is a performance optimization (coming soon…).
Examples
{:ok, pid} = RethinkDB.Connection.start_link() # No name necessary
worker(RethinkDB.Connection, [[name: :some_name]]) # Supervise with name (options list is the single start_link argument)
defmodule FooDatabase, do: use RethinkDB.Connection
worker(FooDatabase, []) # Automatically uses name FooDatabase
c = RethinkDB.connect # Convenience function for non production use
table("people") |> RethinkDB.run(pid) # Pass pid to run
table("people") |> RethinkDB.run(:some_name) # Pass name to run
table("people") |> FooDatabase.run # Automatically uses name FooDatabase
table("people") |> RethinkDB.run(c) # Pass in connection
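To illustrate what "multiplexing" means here, the sketch below shows one process serving several concurrent callers by tagging each request with a token and matching responses back to callers. It is a toy model under stated assumptions, not the driver's code: MultiplexSketch is hypothetical, and the network round trip is simulated with a delayed message.

```elixir
defmodule MultiplexSketch do
  use GenServer

  def start_link, do: GenServer.start_link(__MODULE__, :ok)
  def run(conn, query), do: GenServer.call(conn, {:run, query})

  def init(:ok), do: {:ok, %{next_token: 1, pending: %{}}}

  # Tag the query with a fresh token, remember the caller, and return without
  # replying so the process can accept further queries right away.
  def handle_call({:run, query}, from, %{next_token: token, pending: pending} = state) do
    # Stand-in for the network: the "server" answers after a short delay.
    Process.send_after(self(), {:response, token, {:ok, query}}, 10)
    {:noreply, %{state | next_token: token + 1, pending: Map.put(pending, token, from)}}
  end

  # Match a response to its waiting caller by token.
  def handle_info({:response, token, result}, %{pending: pending} = state) do
    {from, rest} = Map.pop(pending, token)
    GenServer.reply(from, result)
    {:noreply, %{state | pending: rest}}
  end
end

{:ok, conn} = MultiplexSketch.start_link()

# Five callers share one connection process; each gets its own answer.
results =
  1..5
  |> Enum.map(fn n -> Task.async(fn -> MultiplexSketch.run(conn, n) end) end)
  |> Enum.map(&Task.await/1)

IO.inspect(results)
```

Since no caller blocks the process while its response is in flight, a single connection can keep many queries outstanding at once, which is why pooling is an optimization rather than a requirement.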
Feeds à la OTP
Example - Point Feed
defmodule PersonChangefeed do
  use RethinkDB.Changefeed
  import RethinkDB.Query

  def init(opts) do
    conn = Dict.get(opts, :conn)
    id = Dict.get(opts, :id)
    query = table("people") |> get(id) |> changes()
    {:subscribe, query, conn, nil}
  end

  # This clause must come first: the general clause below would also match nil.
  def handle_update(%{"new_val" => nil}, state) do
    {:stop, :normal, state} # Entry was deleted
  end

  def handle_update(%{"new_val" => data}, _state) do
    {:next, data} # Store the new data
  end

  def handle_call(:get, _from, state) do
    {:reply, state, state} # Respond with data
  end
end
Example - Sequence Feed with GenEvent
defmodule TeamChangefeed do
  use RethinkDB.Changefeed
  import RethinkDB.Query

  def init(opts) do
    conn = Dict.get(opts, :conn)
    manager = Dict.get(opts, :gen_event_manager)
    query = table("teams") |> changes()
    {:subscribe, query, conn, manager}
  end

  def handle_update(update, manager) do
    notification = case update do
      %{"old_val" => nil} -> # New entry
        {:team_added, update["new_val"]}
      %{"new_val" => nil} -> # Entry removed
        {:team_removed, update["old_val"]}
      _ -> # Update to existing entry
        {:team_update, update["old_val"], update["new_val"]}
    end
    GenEvent.notify(manager, notification)
    {:next, manager}
  end
end
What next?
Goal: I want to build a RethinkDB driver so powerful
that it’s the reason people choose Elixir. I think OTP
principles combined with Change feeds will be a
huge part of that.
Currently done: Query language, Connections,
Change feeds
Needed: Documentation for Connections and
Change feeds, connection pooling.
Help wanted
Quality Assurance. Pick a query function and compare it
to the Ruby/JavaScript/Python driver. Report any
differences.
Example apps. Whatever you build while tinkering, please
send it my way. I am happy to give feedback and to help
polish it up to showcase as an example.
Performance testing. How many queries can a single
connection handle? How does network latency affect
performance? How does it hold up against the JavaScript
driver?
Thanks!
Shameless plug
I work on Elixir in my spare time. In my day job I am a full-stack
developer at Yahoo on the BrightRoll Video Ads and Data
team, doing everything from low-latency web services in C
(directly on top of libevent) to HTML5 video players.
If you are using Elixir and RethinkDB in a production
system, I’d love to talk.
peterghamilton@gmail.com
github: hamiltop
