You got schema in my JSON!
Evolving schemas in a schemaless world

Philipp Fehre

Technical Evangelist at Couchbase
From prototype to production and beyond
“Create an available, scalable search index”

–The Spec
Least downtime possible
Scale horizontally
Not the primary data source
Welcome to NoSQL Land
Data in -> Index
Query -> IDs out
module DB
  STORE = {
    keyOne: [1, 2],
    keyTwo: [1, 3, 4]
  }

  def self.find_all(keys)
    keys.map { |k| STORE.fetch(k.to_sym, []) }
  end
end
def get_search_results(keys)
  results = DB.find_all(keys)
  results.flatten.uniq
end

get_search_results ["keyOne", "keyTwo"]
# => [1, 2, 3, 4]
“Add options for dynamic filtering.”

–The Spec
Data in -> Index + Filter Data
Query -> IDs out
How to evolve the data?
Still in the prototyping phase
Drop and rebuild!
module DB
  STORE = {
    keyOne: [{id: 1, prop: "bar"},
             {id: 2, prop: "foo"}],
    keyTwo: [{id: 1, prop: "bar"},
             {id: 3, prop: "bar"},
             {id: 4, prop: "bar"}]
  }

  def self.find_all(keys, prop)
    keys.map { |k|
      STORE.fetch(k.to_sym, []).map { |e|
        e if e[:prop] == prop
      }.compact
    }
    # => [[{:id=>1, :prop=>"bar"}], …]
  end
end
1 != {id: 1, …}
Your code only understands one format
Your code should only understand one format
Implicit Schema
def get_search_results(keys, prop = "bar")
  results = DB.find_all(keys, prop)
  results.flatten.uniq.map { |e| e[:id] }
end

get_search_results ["keyOne", "keyTwo"]
# => [1, 3, 4]
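The implicit schema that get_search_results relies on can be made visible with a small check. This is only a sketch: REQUIRED_KEYS and valid_entry? are hypothetical names that do not appear in the talk, and the key list is an assumption based on the store format shown above.

```ruby
# Hypothetical helper, not part of the original slides.
# The keys the filtering code silently expects every entry to have.
REQUIRED_KEYS = [:id, :prop].freeze

# True when an entry has the shape the filtering code expects.
def valid_entry?(entry)
  entry.is_a?(Hash) && REQUIRED_KEYS.all? { |key| entry.key?(key) }
end

old_entry = 1                        # format of the first prototype
new_entry = { id: 1, prop: "bar" }   # format after adding filter data

puts valid_entry?(old_entry) # prints false
puts valid_entry?(new_entry) # prints true
```

A check like this would likely live at the boundary to the store, so the rest of the code keeps understanding exactly one format.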
Time to go live
Time to improve
Part of the concept validation
“Don’t hit the main datastore for preview”

–The Spec
Data in -> Index + Filter Data + Cached Data
Query -> JSON out
How to evolve the data, now?
We can’t rebuild the dataset
In SQL this is done via migrations
class AddFields < ActiveRecord::Migration
  def up
    change_table :quotas do |t|
      t.column :free, :integer
    end

    # Backfill the new column for every existing row
    Quota.all.each do |quota|
      quota.free = calculate_free(quota)
      quota.save
    end
  end
end
Crawling all data is slow
class AddFields < ActiveRecord::Migration
  def up
    change_table :quotas do |t|
      t.column :free, :integer
    end

    # Commit the schema change before starting the slow backfill
    execute "COMMIT"
    Quota.all.each do |quota|
      quota.free = calculate_free(quota)
      quota.save
    end
  end
end
During migration the implicit schema is broken
Migrate data up “on the fly”
Migrate data down “on the fly”
Shorten the time until the new data is usable
Versioning the data
module DB
  STORE = {
    keyOne: [{id: 1, prop: "bar", vsn: 2, free: 20 },
             {id: 2, prop: "foo", vsn: 2, free: 10 }],
    keyTwo: [{id: 1, prop: "bar"},
             {id: 3, prop: "bar"},
             {id: 4, prop: "bar"}]
  }

  def self.find_all(keys, prop)
    keys.map { |k|
      STORE.fetch(k.to_sym, []).map { |e|
        e if e[:prop] == prop
      }.compact
    }
  end
end
def is_version2?(data); data[:vsn] == 2; end
def get_free(id); 20; end
def save(data); data; end

def transform_to_v2(data)
  return data if is_version2?(data)

  data[:vsn] = 2
  data[:free] = get_free(data[:id])
  save data
end

def get_search_results(keys)
  results = DB.find_all(keys, "bar")
  results.flatten
         .map { |data| transform_to_v2 data }
         .uniq
end

get_search_results ["keyOne", "keyTwo"]
# =>
# [{:id=>1, :prop=>"bar", :vsn=>2, :free=>20},
#  {:id=>3, :prop=>"bar", :vsn=>2, :free=>20},
#  {:id=>4, :prop=>"bar", :vsn=>2, :free=>20}]
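The slides only show the upward direction. Migrating data down "on the fly" could look roughly like the sketch below, which serves an older reader by stripping the fields that version 2 introduced. transform_to_v1 and V2_ONLY_KEYS are hypothetical names, and the assumption is that version 1 is simply the record without vsn and free.

```ruby
# Hypothetical counterpart to the upgrade function; names are assumptions.
# Fields that only exist from version 2 onwards.
V2_ONLY_KEYS = [:vsn, :free].freeze

# Return a copy of the record in the unversioned (v1) shape.
# The input hash is not modified, so v2 readers keep working.
def transform_to_v1(data)
  data.reject { |key, _value| V2_ONLY_KEYS.include?(key) }
end

v2_record = { id: 1, prop: "bar", vsn: 2, free: 20 }
v1_record = { id: 3, prop: "bar" }

p transform_to_v1(v2_record) # only :id and :prop remain
p transform_to_v1(v1_record) # already v1, returned unchanged in content
```

Returning a copy instead of writing back is a deliberate choice in this sketch: a downgrade on read should not destroy data that newer code has already computed.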
DRY it up: Deprecator
https://github.com/sideshowcoder/deprecator
class Thing
  def initialize *args
    args.each do |k, v|
      self.instance_variable_set "@#{k}", v
    end
    @version = 0 unless @version
  end
  attr_accessor :version

  include Deprecator::Versioning
  ensure_version 2, :upgrade_to

  def upgrade_to expected_version
    # handle the version upgrade
    save
  end

  def save
    # save back to the store
  end
end
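What "handle the version upgrade" might contain once there are several versions can be sketched in plain Ruby. This is not the Deprecator API. UPGRADES, CURRENT_VERSION and upgrade are hypothetical names, the two steps are invented for illustration, and records without a vsn field are assumed to be version 0.

```ruby
# Plain-Ruby sketch; this is NOT the Deprecator API.
# UPGRADES maps a version to the step that lifts a record to the next one.
UPGRADES = {
  # hypothetical step: give very old records a default prop
  0 => ->(data) { data.merge(vsn: 1, prop: data.fetch(:prop, "bar")) },
  # hypothetical step: add the free field (placeholder value 20)
  1 => ->(data) { data.merge(vsn: 2, free: 20) }
}.freeze

CURRENT_VERSION = 2

# Apply one step after another until the record is current.
# Records without a vsn field are treated as version 0 (an assumption).
def upgrade(data)
  version = data.fetch(:vsn, 0)
  while version < CURRENT_VERSION
    data = UPGRADES.fetch(version).call(data)
    version = data.fetch(:vsn)
  end
  data
end

p upgrade({ id: 5 })                         # passes through both steps
p upgrade({ id: 1, prop: "foo", vsn: 1 })    # only the last step runs
```

Keeping one small step per version means a record of any age can be brought up to date on read, without a separate code path for every old format.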
Lessons learned
Painless data changes are important
Schemaless does not mean no Schema at all
With freedom comes more responsibility
Agile Code needs Agile Data
Couchbase
NoSQL Database

Any Questions?
Philipp Fehre
github.com/sideshowcoder
twitter.com/ischi
sideshowcoder.com
Talks to listen to
• Schemalessness: http://cloud.dzone.com/articles/martinfowler-schemalessness
• Introduction to NoSQL: http://www.youtube.com/watch?v=qI_g07C_Q5I
