You got schema in my JSON!
Evolving schemas in a schemaless world

Philipp Fehre
Technical Evangelist at Couchbase
From prototype to production and beyond
“Create an available, scalable search index”

–The Spec
Least downtime possible
Scale horizontally
Not the primary data source
Welcome to NoSQL Land
Data in —> Index
Query —> IDs out
module DB
  STORE = {
    keyOne: [1, 2],
    keyTwo: [1, 3, 4]
  }

  def self.find_all keys
    keys.map { |k| STORE.fetch(k.to_sym, []) }
  end
end
def get_search_results keys
  results = DB::find_all(keys)
  results.flatten.uniq
end

get_search_results ["keyOne", "keyTwo"]
# => [1, 2, 3, 4]
“Add options for dynamic filtering.”

–The Spec
Data in —> Index + Filter Data
Query —> IDs out
How to evolve the data?
Still in the prototyping phase
Drop and rebuild!
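module DB
  STORE = {
    keyOne: [{id: 1, prop: "bar"},
             {id: 2, prop: "foo"}],
    keyTwo: [{id: 1, prop: "bar"},
             {id: 3, prop: "bar"},
             {id: 4, prop: "bar"}]
  }

  def self.find_all keys, prop
    keys.map { |k|
      STORE.fetch(k.to_sym, []).select { |e| e[:prop] == prop }
    }
    # => [[{:id=>1, :prop=>"bar"}], …]
  end
end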
1 != {id: 1, …}
Your code only understands one format
Your code should only understand one format
Implicit Schema
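One way to make the implicit schema visible is to check it at the point where the code reads a record. The following is a minimal sketch, not code from the talk: `REQUIRED_KEYS`, `conforms_to_schema?` and `ids_for` are hypothetical names, and the record shape (`:id`, `:prop`) is assumed from the slides.

```ruby
# Hypothetical helpers for illustration; only the record shape
# ({id:, prop:}) comes from the slides.
REQUIRED_KEYS = [:id, :prop].freeze

# True when the entry is a hash carrying every key the code relies on.
def conforms_to_schema?(entry)
  entry.is_a?(Hash) && REQUIRED_KEYS.all? { |key| entry.key?(key) }
end

# Extract ids, failing loudly on records in a format the code does not know.
def ids_for(entries)
  entries.map do |entry|
    unless conforms_to_schema?(entry)
      raise ArgumentError, "unexpected record format: #{entry.inspect}"
    end
    entry[:id]
  end
end

ids_for([{ id: 1, prop: "bar" }, { id: 3, prop: "bar" }]) # => [1, 3]
```

Passing the old format (plain integers such as `[1, 2]`) raises an `ArgumentError` instead of failing later in an unrelated place.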
def get_search_results keys, prop = "bar"
  results = DB::find_all(keys, prop)
  results.flatten.uniq.map { |e| e[:id] }
end

get_search_results ["keyOne", "keyTwo"]
# => [1, 3, 4]
Time to go live
Time to improve
Part of the concept validation
“Don’t hit the main datastore for preview”

–The Spec
Data in —> Index + Filter Data + Cached Data
Query —> JSON out
How to evolve the data, now?
We can’t rebuild the dataset
In SQL this is done via migrations
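class AddFields < ActiveRecord::Migration
  def up
    change_table :quotas do |t|
      t.column :free, :integer
    end

    # Backfill the new column for every existing row.
    Quota.all.each do |quota|
      quota.free = calculate_free(quota)
      quota.save
    end
  end
end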
Crawling all data is slow
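class AddFields < ActiveRecord::Migration
  def up
    change_table :quotas do |t|
      t.column :free, :integer
    end

    # Commit the schema change before the slow backfill starts.
    execute "COMMIT"

    Quota.all.each do |quota|
      quota.free = calculate_free(quota)
      quota.save
    end
  end
end
During the migration, the implicit schema is broken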
Migrate data up “on the fly”
Migrate data down “on the fly”
Shorten the time until the new data is usable
Versioning the data
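module DB
  STORE = {
    keyOne: [{id: 1, prop: "bar", vsn: 2, free: 20},
             {id: 2, prop: "foo", vsn: 2, free: 10}],
    keyTwo: [{id: 1, prop: "bar"},
             {id: 3, prop: "bar"},
             {id: 4, prop: "bar"}]
  }

  def self.find_all keys, prop
    keys.map { |k|
      STORE.fetch(k.to_sym, []).select { |e| e[:prop] == prop }
    }
  end
end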
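def is_version2? data; data[:vsn] == 2; end
def get_free id; 20; end    # stub standing in for a real lookup
def save data; data; end    # stub standing in for a real store write

def transform_to_v2 data
  return data if is_version2?(data)

  data[:vsn] = 2
  data[:free] = get_free(data[:id])
  save data
end
def get_search_results keys
  results = DB::find_all(keys, "bar")
  results.flatten
         .map { |data| transform_to_v2 data }
         .uniq
end

get_search_results ["keyOne", "keyTwo"]
# => [{:id=>1, :prop=>"bar", :vsn=>2, :free=>20},
#     {:id=>3, :prop=>"bar", :vsn=>2, :free=>20},
#     {:id=>4, :prop=>"bar", :vsn=>2, :free=>20}]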
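The code above covers the upward path only. The "Migrate data down on the fly" case, where code that still expects the old format reads a record already written in the new one, might look roughly like this. It is a sketch under assumptions: `downgrade_to_v1` is a hypothetical name, and it assumes the only difference between the formats is the `:vsn` and `:free` fields shown in the slides.

```ruby
# Hypothetical downgrade step; field names (:vsn, :free) follow the slides,
# the function itself is an illustration and not part of the talk.
V2_ONLY_KEYS = [:vsn, :free].freeze

def downgrade_to_v1(data)
  # Records without the version 2 marker are already in the old format.
  return data unless data[:vsn] == 2

  # Return a copy without the v2-only fields; the stored record is untouched.
  data.reject { |key, _value| V2_ONLY_KEYS.include?(key) }
end

v2_record = { id: 1, prop: "bar", vsn: 2, free: 20 }
v1_view = downgrade_to_v1(v2_record)   # hash with only :id and :prop
```

Returning a copy rather than mutating keeps the newer data intact, so older and newer code can read the same store while a rollout is in progress.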
DRY it up: Deprecator
https://github.com/sideshowcoder/deprecator
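class Thing
  def initialize *args
    args.each do |k, v|
      self.instance_variable_set "@#{k}", v
    end
    @version = 0 unless @version
  end
  attr_accessor :version

  include Deprecator::Versioning
  ensure_version 2, :upgrade_to

  def upgrade_to expected_version
    # handle the version upgrade
    save
  end

  def save
    # save back to the store
  end
end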
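Independent of any library, the idea of upgrading a record step by step until it reaches the expected version can be sketched in plain Ruby. This does not use or describe Deprecator's API: `UPGRADES`, `CURRENT_VERSION` and `upgrade` are hypothetical names, the field values are placeholders, and a record without `:vsn` is assumed to be version 0.

```ruby
# Hypothetical upgrade chain: each lambda lifts a record from the keyed
# version to the next one. Field values are placeholders for illustration.
UPGRADES = {
  0 => ->(data) { data.merge(vsn: 1, prop: data.fetch(:prop, "bar")) },
  1 => ->(data) { data.merge(vsn: 2, free: 20) }
}.freeze

CURRENT_VERSION = 2

# Apply upgrade steps one at a time until the record is current.
def upgrade(data)
  version = data.fetch(:vsn, 0)
  while version < CURRENT_VERSION
    data = UPGRADES.fetch(version).call(data)
    version = data.fetch(:vsn)
  end
  data
end

upgraded = upgrade({ id: 1 })
# upgraded now carries :vsn 2 plus the :prop and :free fields
```

Keeping one small step per version means a record of any age can be brought up to date on read, without a dedicated path for every old-to-new combination.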
Lessons learned
Painless data changes are important
Schemaless does not mean no Schema at all
With freedom comes more responsibility
Agile Code needs Agile Data
Couchbase
NoSQL Database

Any Questions?
Philipp Fehre
github.com/sideshowcoder
twitter.com/ischi
sideshowcoder.com
Talks to listen to
• Schemalessness: http://cloud.dzone.com/articles/martinfowler-schemalessness
• Introduction to NoSQL: http://www.youtube.com/watch?v=qI_g07C_Q5I
Handling schema migrations and changing data in NoSQL databases. Agile Code needs Agile Data.

Published in: Technology
You got schema in my json

  1. 1. You got schema in my JSON! Evolving schemas in a schemaless world. Philipp Fehre, Technical Evangelist at Couchbase
  2. 2. From prototype to production and beyond
  3. 3. “Create an available, scalable search index” –The Spec
  4. 4. Least downtime possible
  5. 5. Scale horizontally
  6. 6. Not the primary data source
  7. 7. Welcome to NoSQL Land
  8. 8. Data in —> Index Query —> IDs out
  9. 9. module DB
          STORE = {
            keyOne: [1, 2],
            keyTwo: [1, 3, 4]
          }
          def self.find_all keys
            keys.map { |k| STORE.fetch(k.to_sym, []) }
          end
        end
  10. 10. def get_search_results keys
            results = DB::find_all(keys)
            results.flatten.uniq
          end
          get_search_results ["keyOne", "keyTwo"]
          # => [1, 2, 3, 4]
  11. 11. “Add options for dynamic filtering.” –The Spec
  12. 12. Data in —> Index + Filter Data Query —> IDs out
  13. 13. How to evolve the data?
  14. 14. Still in the prototyping phase
  15. 15. Drop and rebuild!
  16. 16. module DB
            STORE = {
              keyOne: [{id: 1, prop: "bar"},
                       {id: 2, prop: "foo"}],
              keyTwo: [{id: 1, prop: "bar"},
                       {id: 3, prop: "bar"},
                       {id: 4, prop: "bar"}]
            }
            def self.find_all keys, prop
              keys.map { |k|
                STORE.fetch(k.to_sym, []).select { |e| e[:prop] == prop }
              }
              # => [[{:id=>1, :prop=>"bar"}], …]
            end
          end
  17. 17. 1 != {id: 1, …}
  18. 18. Your code only understands one format
  19. 19. Your code should only understand one format
  20. 20. Implicit Schema
  21. 21. def get_search_results keys, prop = "bar"
            results = DB::find_all(keys, prop)
            results.flatten.uniq.map { |e| e[:id] }
          end
          get_search_results ["keyOne", "keyTwo"]
          # => [1, 3, 4]
  22. 22. Time to go live
  23. 23. Time to improve
  24. 24. Part of the concept validation
  25. 25. “Don’t hit the main datastore for preview” –The Spec
  26. 26. Data in —> Index + Filter Data + Cached Data Query —> JSON out
  27. 27. How to evolve the data, now?
  28. 28. We can’t rebuild the dataset
  29. 29. In SQL this is done via migrations
  30. 30. class AddFields < ActiveRecord::Migration
            def up
              change_table :quotas do |t|
                t.column :free, :integer
              end
              Quota.all.each do |quota|
                quota.free = calculate_free(quota)
                quota.save
              end
            end
          end
  31. 31. Crawling all data is slow
  32. 32. class AddFields < ActiveRecord::Migration
            def up
              change_table :quotas do |t|
                t.column :free, :integer
              end
              execute "COMMIT"
              Quota.all.each do |quota|
                quota.free = calculate_free(quota)
                quota.save
              end
            end
          end
  33. 33. During the migration, the implicit schema is broken
  34. 34. Migrate data up “on the fly”
  35. 35. Migrate data down “on the fly”
  36. 36. Shorten the time until the new data is usable
  37. 37. Versioning the data
  38. 38. module DB
            STORE = {
              keyOne: [{id: 1, prop: "bar", vsn: 2, free: 20},
                       {id: 2, prop: "foo", vsn: 2, free: 10}],
              keyTwo: [{id: 1, prop: "bar"},
                       {id: 3, prop: "bar"},
                       {id: 4, prop: "bar"}]
            }
            def self.find_all keys, prop
              keys.map { |k|
                STORE.fetch(k.to_sym, []).select { |e| e[:prop] == prop }
              }
            end
          end
  39. 39. def is_version2? data; data[:vsn] == 2; end
          def get_free id; 20; end
          def save data; data; end
          def transform_to_v2 data
            return data if is_version2?(data)
            data[:vsn] = 2
            data[:free] = get_free(data[:id])
            save data
          end
  40. 40. def get_search_results keys
            results = DB::find_all(keys, "bar")
            results.flatten
                   .map { |data| transform_to_v2 data }
                   .uniq
          end
          get_search_results ["keyOne", "keyTwo"]
          # => [{:id=>1, :prop=>"bar", :vsn=>2, :free=>20},
          #     {:id=>3, :prop=>"bar", :vsn=>2, :free=>20},
          #     {:id=>4, :prop=>"bar", :vsn=>2, :free=>20}]
  41. 41. DRY it up: Deprecator https://github.com/sideshowcoder/deprecator
  42. 42. class Thing
            def initialize *args
              args.each do |k, v|
                self.instance_variable_set "@#{k}", v
              end
              @version = 0 unless @version
            end
            attr_accessor :version
            include Deprecator::Versioning
            ensure_version 2, :upgrade_to
            def upgrade_to expected_version
              # handle the version upgrade
              save
            end
            def save
              # save back to the store
            end
          end
  43. 43. Lessons learned
  44. 44. Painless data changes are important
  45. 45. Schemaless does not mean no Schema at all
  46. 46. With freedom comes more responsibility
  47. 47. Agile Code needs Agile Data
  48. 48. Couchbase NoSQL Database
  49. 49. Any Questions? Philipp Fehre: github.com/sideshowcoder, twitter.com/ischi, sideshowcoder.com
  50. 50. Talks to listen to
          • Schemalessness: http://cloud.dzone.com/articles/martinfowler-schemalessness
          • Introduction to NoSQL: http://www.youtube.com/watch?v=qI_g07C_Q5I
