Shutl nosql exchange talk

Transcript

  • 1. How NoSQL allows Shutl to deliver even faster: database selection, implementation, problems and solutions, lessons learned
  • 2. Volker Pacher, senior developer @shutl, @vpacher, http://github.com/vpacher
  • 11.
      • SaaS platform
      • we provide an API for carriers and merchants
      • built on MySQL, Rails and Ruby MRI
      • customers can choose between two delivery options:
          within 90 minutes of purchase
          or a 1 hour window of their choice (same day or any day)
      • fastest delivery to date: 8:30 min
      • customers: Argos, Maplin, DrEd.com ...
  • 12. Problems?
  • 13. http://xkcd.com/287/
  • 18. Challenges and Problems
      • a whole zoo of NP-complete problems
      • exponential growth of joins in MySQL with added features
      • code base too complex and unmaintainable
      • API response time growing too large as more data is added
  • 25. The solution for v2:
      build a new API on the basis of Sinatra and JRuby
      databases used:
      - neo4j embedded
      - MongoDB
      - redis
      - and MySQL
  • 31. The case for graph databases
      • relationships are stored explicitly
      • whiteboard friendly and easier domain modeling
      • schema-less
      • a graph is its own index
      • traversals of relationships are easy
  • 32. a graph is its own index
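A minimal plain-Ruby sketch of what "a graph is its own index" can mean. This is not Neo4j code: it only assumes that each node keeps direct references to its neighbours, so a traversal follows those references instead of looking up ids in a join table. The `Node` class, the `reachable` helper and the Alice/Bob/Carol data are made up for illustration.

```ruby
# Hypothetical in-memory graph, for illustration only (not the Neo4j API).
class Node
  attr_reader :name, :rels

  def initialize(name)
    @name = name
    @rels = Hash.new { |hash, type| hash[type] = [] }
  end

  # Store the relationship on the node itself: no separate join table.
  def connect(type, other)
    rels[type] << other
    self
  end
end

# Breadth-first traversal along one relationship type, up to max_depth hops.
def reachable(start, type, max_depth)
  seen = [start]
  frontier = [start]
  max_depth.times do
    # Follow the stored references directly; skip nodes already visited.
    frontier = (frontier.flat_map { |n| n.rels.fetch(type, []) } - seen).uniq
    seen.concat(frontier)
  end
  seen - [start]
end

alice = Node.new("Alice")
bob   = Node.new("Bob")
carol = Node.new("Carol")
alice.connect(:knows, bob)
bob.connect(:knows, carol)

puts reachable(alice, :knows, 1).map(&:name).inspect
puts reachable(alice, :knows, 2).map(&:name).inspect
```

Each extra hop costs one more pass over the current frontier, whereas the relational equivalent would typically add one more self-join per hop.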
  • 33. available graph databases:
      • neo4j (jvm)
      • flockdb (jvm)
      • DEX (c++)
      • OrientDB (jvm)
      • Sones GraphDB (c#)
  • 40. Neo4j: why did we choose it:
      • it didn’t solve our NP-complete problems but it solved our join hell
      • we can run it embedded in the same JVM
      • we can use JRuby, as we already know Ruby very well
      • lots of good Ruby libraries are available; we chose the neo4j gem by Andreas Ronge (https://github.com/andreasronge/neo4j)
      • it speaks cypher
      • the guys from Neo Technology are awesome
  • 41. Neo4j: embedded vs. standalone
      embedded, pros:
      • better performance and transaction support
      • neo4j gem is available and we can use cypher and traversal
      embedded, cons:
      • only the code running the db has access to the db
      standalone, pros:
      • access via REST API and cypher
      • language independent, code doesn’t need to run on the JVM
      standalone, cons:
      • not as performant
      • only works with cypher
      • transaction is on a per-query basis
      • need to write model wrappers for ourselves
  • 42. Neo4j: alternatives to the neo4j gem:
      • neography by Max de Marzi (https://github.com/maxdemarzi/neography)
      • pacer by Darrick Wiebe (https://github.com/pangloss/pacer-neo4j)
  • 50. Neo4j: gotchas and stuff we didn’t know:
      • testing proved to be difficult and we had to write our own tools
      • some libraries (like cucumber) are not compatible with JRuby
      • the switch from a relational ‘mentality’ to a graph one was harder than expected
      • we have no real solution for migrations so far
      • seeding an embedded database is hard
      • encoding Dates and Times so that they are stored in UTC and work across timezones is non-trivial
      • nested data structures (hashes and arrays) can’t be stored and need to be converted to JSON
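The last two gotchas can be sketched in plain Ruby. This is only an illustration under assumptions, not the approach Shutl used: it assumes properties may only hold strings, numbers and booleans, so nested structures are written as JSON text and times as UTC ISO 8601 strings. The helper names (`encode_property`, `decode_json_property`, `decode_time_property`) are hypothetical.

```ruby
require "json"
require "time"

# Hypothetical helper: convert a value into something a flat property can hold.
def encode_property(value)
  case value
  when Hash, Array then JSON.generate(value)   # nested data becomes JSON text
  when Time        then value.getutc.iso8601   # always store times in UTC
  else value                                   # strings, numbers, booleans pass through
  end
end

# Hypothetical helper: read a JSON-encoded property back into hashes/arrays.
def decode_json_property(text)
  JSON.parse(text)
end

# Hypothetical helper: read a stored UTC time and present it in a given offset.
def decode_time_property(text, utc_offset)
  Time.iso8601(text).getlocal(utc_offset)
end

props = {
  "opening_hours" => encode_property({ "mon" => ["09:00", "17:00"] }),
  "created_at"    => encode_property(Time.new(2012, 7, 1, 12, 0, 0, "+01:00"))
}

puts props["opening_hours"]
puts props["created_at"]
puts decode_time_property(props["created_at"], "+01:00")
```

The caller has to remember which properties were encoded, since a JSON string and a plain string look the same once stored; a naming convention or a model wrapper is one way to keep track of that.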
  • 51. Neo4j: testing:
      • we are using rspec for all tests on the API and practice TDD/BDD
      • setting up ‘scenarios’ for an integration test was almost impossible with existing tools
      • we decided to build our own tool based on the geoff notation developed by Nigel Small
  • 52. Neo4j: geoff:
      developed by Nigel Small (@technige, http://geoff.nigelsmall.net/)
      allows modelling of graphs in a human readable form
          (A) {"name": "Alice"}
          (B) {"name": "Bob"}
          (A)-[:KNOWS]->(B)
      and provides an interface to insert them into an existing graph
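To make the notation concrete, here is a small plain-Ruby sketch that reads the three example lines from the slide into nodes and relationships. It is not Nigel Small's implementation and not the geoff gem: `parse_geoff` and the two regular expressions are hypothetical, and they only cover the two line shapes shown on the slide.

```ruby
require "json"

# Hypothetical patterns covering only the two line shapes shown on the slide.
NODE_LINE = /\A\((\w+)\)\s*(\{.*\})?\z/          # (A) {"name": "Alice"}
REL_LINE  = /\A\((\w+)\)-\[:(\w+)\]->\((\w+)\)\z/ # (A)-[:KNOWS]->(B)

def parse_geoff(text)
  nodes = {}
  rels = []
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if (m = REL_LINE.match(line))
      rels << { from: m[1], type: m[2], to: m[3] }
    elsif (m = NODE_LINE.match(line))
      # The optional property block is plain JSON in the slide's example.
      nodes[m[1]] = m[2] ? JSON.parse(m[2]) : {}
    else
      raise ArgumentError, "unsupported line: #{line}"
    end
  end
  { nodes: nodes, rels: rels }
end

graph = parse_geoff(<<~GEOFF)
  (A) {"name": "Alice"}
  (B) {"name": "Bob"}
  (A)-[:KNOWS]->(B)
GEOFF

puts graph[:nodes].inspect
puts graph[:rels].inspect
```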
  • 53. Neo4j: geoff gem (https://github.com/shutl/geoff)
      • provides a DSL for creating a graph and inserting it into the db
      • it is open source
      • it works together with FactoryGirl (https://github.com/thoughtbot/factory_girl)
      • it supports only the graph structure of the neo4j gem at the moment
      • we haven’t solved all the issues with event listeners yet
  • 54. Neo4j: FactoryGirl (https://github.com/thoughtbot/factory_girl)
      # This will guess the User class
      FactoryGirl.define do
        factory :user do
          first_name "John"
          last_name "Doe"
          admin false
        end

        # This will use the User class (Admin would have been guessed)
        factory :admin, class: User do
          first_name "Admin"
          last_name "User"
          admin true
        end
      end
  • 55. Neo4j: geoff gem (https://github.com/shutl/geoff)
      # Gemfile
      gem 'geoff'

      # Basic tree like structure for DSL
      # the first line generates the class nodes used by Neo4jWrapper
      # NB Company and Person are classes with the Neo4j::NodeMixin
      Geoff(Company, Person) do
        company 'Acme' do
          address "13 Something Road"
          outgoing :employees do
            person 'Geoff'
            person 'Nigel' do
              name 'Nigel Small'
            end
          end
        end
        company 'Github' do
          outgoing :customers do
            person 'Tom'
            person 'Dick'
            person 'Harry'
          end
        end
        person 'Harry' do
          incoming :customers do
            company 'NeoTech'
          end
        end
      end
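  • 56. [Diagram: the graph created by the DSL on the previous slide. The root node links to the :company and :person class nodes; these link via :all to the companies (Acme at 13 Something Road, GitHub, NeoTech) and to the people (Geoff, Nigel Small, Tom, Dick, Harry). Acme has :employees relationships to Geoff and Nigel; GitHub has :customers relationships to Tom, Dick and Harry; NeoTech has a :customers relationship to Harry.]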
  • 57. where does FactoryGirl come in?
  • 58. if there are lots of attributes on a node it becomes difficult to read
      Geoff(Company, Person) do
        company 'Acme' do
          address "13 Something Road"
          contact_name "Jane Doe"
          contact_email "jane@acme.com"
          outgoing :employees do
            person 'Geoff' do
              first_name "Geoff"
              last_name "Small"
              email "geoff@geoff.com"
            end
          end
        end
      end
  • 59. attributes can be specified in FactoryGirl
      FactoryGirl.define do
        factory :acme, class: Company do
          address "13 Something Road"
          contact_name "Jane Doe"
          contact_email "jane@acme.com"
        end
      end

      FactoryGirl.define do
        factory :geoff, class: Person do
          first_name "Geoff"
          last_name "Small"
          email "geoff@geoff.com"
        end
      end
  • 60. and used in the geoff DSL
      Geoff(Company, Person) do
        company 'Acme' do
          geoffactory :acme
          outgoing :employees do
            person 'Geoff' do
              geoffactory :geoff
            end
          end
        end
      end
      it also works with traits
  • 68. MongoDB: why did we choose it:
      • very easy to get started (also on dev machines)
      • it is schemaless
      • it allows for easy sharding and horizontal scaling
      • it is a JSON store (as our API is a JSON API, it allows very easy storage of the query results)
      • it allows easy storage of nested structures and allows for queries inside those structures
      • lots of Ruby gems available (we use mongomapper, http://mongomapper.com/)
      • the 10gen office is 2 floors above ours
  • 71. MongoDB: gotchas, cons and stuff we didn’t know:
      • it is schemaless, so changes to the schema need to be handled carefully
      • writes and updates follow the ‘fire and forget’ pattern; raising an error on save needs to be explicitly enabled
  • 72. using the decorator/presenter pattern for schemaless dbs
      • controller retrieves object from MongoDB
      • decorator ‘decorates’ object for presentation
      • controller passes decorated object to view
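A minimal sketch of the decorator idea using Ruby's standard `SimpleDelegator`. It assumes a document arrives as a plain hash and that older documents may lack fields newer code expects; the `DeliveryDecorator` class and its field names (`status`, `courier`) are hypothetical and not taken from Shutl's code base.

```ruby
require "delegate"

# Hypothetical decorator: wraps a schemaless document (a plain Hash here)
# and supplies defaults so the view never has to check for missing fields.
class DeliveryDecorator < SimpleDelegator
  def status
    fetch("status", "unknown")          # older documents may have no status
  end

  def courier_name
    courier = self["courier"] || {}     # the nested structure may be absent
    courier.fetch("name", "unassigned")
  end

  # What the controller would hand to the view.
  def to_view
    { "id" => self["id"], "status" => status, "courier" => courier_name }
  end
end

old_doc = { "id" => 1 }
new_doc = { "id" => 2, "status" => "delivered", "courier" => { "name" => "Sam" } }

[old_doc, new_doc].each do |doc|
  puts DeliveryDecorator.new(doc).to_view.inspect
end
```

Keeping the defaults in one decorator means a schema change touches one class rather than every view that renders the document.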
  • 77. Redis: why did we choose it:
      • fast key-value store for caching and config values
      • very easy to implement
      • we have experience with it and with resque
      • Ruby libraries are available to allow easy access and namespacing
  • 80. Redis: gotchas, cons and stuff we didn’t know:
      • we tried to store default values for neo4j in it and found no solution to include redis updates in a transaction
      • we had some memory issues with resque that were difficult to debug
  • 86. MySQL: why are we using it:
      • we have lots of experience with it
      • it is very good for data aggregation (sums, groups)
      • ideal for financial transaction data, for example
      • some of our non-devs invested time to learn SQL and we didn’t want to lose the skillsets
      • we have already developed the schemas for v1
  • 87. Graphs are awesome!
  • 88. but
  • 89. choose the right database for the job
  • 90. Any questions? Volker Pacher, volker@shutl.co.uk, shutl.co.uk, @vpacher, @shutl