Server side data sync for mobile apps with silex

Today mobile apps are everywhere. These apps cannot count on a reliable and constant internet connection: working in offline mode is becoming a common pattern. This is quite easy for read-only apps, but it rapidly becomes tricky for apps that create data in offline mode. This talk is a case study of a possible architecture for enabling data synchronization in these situations. Some of the topics touched on will be:
- id generation
- hierarchical data
- managing different data types
- sync algorithm

Published in: Software

  1. 1. Implementing data synchronization API for mobile apps with Silex
  2. 2. Michele Orselli CTO@Ideato _orso_ micheleorselli / ideatosrl mo@ideato.it
  3. 3. Agenda: scenario, design choices, implementation, alternative approaches
  4. 4. Sync scenario A B C
  5. 5. Sync scenario ABC ABC ABC
  6. 6. Dealing with conflicts A1 A2 ?
  7. 7. Scenario: brownfield project; several mobile apps for tracking user-generated data (calendar, notes, bio data); iOS & Android; ~10K users, steadily growing at 1.2K/month
  8. 8. Scenario: MongoDB; legacy app based on CodeIgniter; existing RPC-wannabe-REST API for data sync
  9. 9. Scenario: for every resource, get updates: POST /m/:app/get/:user_id/:res/:updated_from; send updates: POST /m/:app/update/:user_id/:res_id/:dev_id/:res
  10. 10. api
  11. 11. Scenario: ~6 different resources, ~12 calls per sync; apps sync by polling every 30 sec; every call syncs little data
  12. 12. Challenge: rebuild the sync API for the old apps + 2 incoming ones; enable image synchronization; be more efficient than the previous API
  13. 13. Existing Solutions. Algorithms: timestamps, vector clocks, CRDTs. Protocols/API: SyncML, Syncano. Platform: Azure Data Sync. Storage: CouchDB, Riak
  14. 14. Not Invented Here? "Don't Reinvent The Wheel, Unless You Plan on Learning More About Wheels" (J. Atwood)
  15. 15. Architecture: 2 different mobile platforms; several teams with different skill levels; changing storage wasn't an option; forcing a particular technology client side wasn't an option
  16. 16. Architecture c1 server c2 c3 sync logic conflicts resolution thin clients
  17. 17. Implementation In the sync domain all resources are managed in the same way
  18. 18. Implementation: for every app, one endpoint for getting new data, one endpoint for pushing changes, one endpoint for uploading images
  19. 19. The new APIs: GET /apps/:app/users/:user_id/changes[?from=:from]; POST /apps/:app/users/:user_id/merge; POST /upload/:res_id/images
  20. 20. Silex Implementation
  21. 21. Silex Implementation Col 1 Col 2 Col 3
  22. 22. Silex Implementation Col 1 Col 2 Col 3 Sync Service
  29. 29. Silex Implementation

          use Silex\Application;
          use Symfony\Component\HttpFoundation\JsonResponse;
          use Symfony\Component\HttpFoundation\Request;

          // GET: return the changes made since the client's last sync
          $app->get('/apps/{mApp}/users/{userId}/changes',
              function ($mApp, $userId, Application $app, Request $request) {
                  // "from" is optional: without it the client gets everything
                  $lastSync = $request->get('from', null);

                  $syncService = $app['syncService'];
                  $syncService->sync($lastSync, $userId);

                  return new JsonResponse($syncService->getResult());
              }
          );
  33. 33. Silex Implementation

          use Silex\Application;
          use Symfony\Component\HttpFoundation\JsonResponse;
          use Symfony\Component\HttpFoundation\Request;

          // POST (not GET): merge the client's changes into the server data
          $app->post('/apps/{mApp}/users/{userId}/merge',
              function ($mApp, $userId, Application $app, Request $request) {
                  $lastSync = $request->get('from', null);
                  $data = $request->get('data', false);

                  $syncService = $app['syncService'];
                  $syncService->merge($data, $lastSync, $userId);

                  return new JsonResponse($syncService->getResult());
              }
          );
  38. 38. Silex Implementation

          // wire the services into the Silex container
          $app['mongodb'] = new MongoDb(…);
          $app['changesRepo'] = new ChangesRepository($app['mongodb']);
          $app['syncService'] = new SyncService($app['changesRepo']);
  39. 39. Get changes GET /apps/:app/users/:user_id/changes?from=:from timestamp?
  40. 40. Server suggests the sync time: timestamps are inaccurate, so the server suggests the "from" parameter to be used in the next request
  41. 41. Server suggests the sync time: c1 sends GET /changes; the server replies { 'next': 12345, 'data': […] }
  42. 42. Server suggests the sync time: c1 sends GET /changes; the server replies { 'next': 12345, 'data': […] }; c1 then sends GET /changes?from=12345; the server replies { 'next': 45678, 'data': […] }
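
The exchange above can be sketched in a few lines. This is only an illustration under stated assumptions: the records live in a plain array, each one carries an integer `updated` value written by the server clock, and the function name `changesSince` is hypothetical (it is not part of the talk's code).

```php
<?php
/**
 * Returns the records changed after $from, plus the value the client
 * should send back as "from" on its next request.
 * $serverNow is injected so that only the server clock is ever used.
 */
function changesSince(array $records, ?int $from, int $serverNow): array
{
    $changed = array_filter(
        $records,
        fn (array $r): bool => $from === null || $r['updated'] > $from
    );

    return [
        'next' => $serverNow,              // suggested "from" for the next call
        'data' => array_values($changed),  // reindex so it encodes as a JSON list
    ];
}

$records = [
    ['guid' => 'a', 'updated' => 100],
    ['guid' => 'b', 'updated' => 200],
];

$first  = changesSince($records, null, 250);            // first sync: everything
$second = changesSince($records, $first['next'], 300);  // nothing newer than 250
```

Because the client only echoes back the `next` value it was given, its own clock never takes part in the comparison.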
  43. 43. what to transfer: operations: {'op': 'add', 'id': '1', 'data': […]} {'op': 'update', 'id': '1', 'data': […]} {'op': 'delete', 'id': '1'} {'op': 'add', 'id': '2', 'data': […]} or states: {'id': '1', 'data': […]} {'id': '2', 'data': […]} {'id': '3', 'data': […]}
  44. 44. what to transfer: we choose to transfer states {'id': '1', 'type': 'measure', '_deleted': true} {'id': '2', 'type': 'note'} {'id': '3', 'type': 'note'} ps: soft delete all the things!
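
A minimal sketch of the soft-delete idea, assuming an in-memory array keyed by id and an integer `updated` field; `softDelete` and `visibleRecords` are hypothetical helper names. Since a deleted record stays in the store with `_deleted` set, the deletion travels to the other clients like any other state change.

```php
<?php
/**
 * Marks a record as deleted instead of removing it, and bumps its
 * "updated" value so the change is picked up by the next sync.
 */
function softDelete(array $store, string $id, int $now): array
{
    if (isset($store[$id])) {
        $store[$id]['_deleted'] = true;
        $store[$id]['updated']  = $now;
    }
    return $store;
}

/** Records the application should display: everything not soft-deleted. */
function visibleRecords(array $store): array
{
    return array_filter($store, fn (array $r): bool => empty($r['_deleted']));
}

$store = [
    '1' => ['id' => '1', 'type' => 'measure', 'updated' => 10],
    '2' => ['id' => '2', 'type' => 'note', 'updated' => 12],
];
$store = softDelete($store, '1', 20);
```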
  45. 45. unique identifiers: How do we generate a unique id in a distributed system?
  46. 46. unique identifiers: How do we generate a unique id in a distributed system? UUID (RFC 4122): several implementations in PHP (https://github.com/ramsey/uuid)
  47. 47. unique identifiers: How do we generate a unique id in a distributed system? Local/Global Id: only the server generates GUIDs; clients use local ids to manage their records
  48. 48. unique identifiers POST /merge { ‘data’: [ {’lid’: ‘1’, …}, {‘lid’: ‘2’, …} ] } c1 server { ‘data’: [ {‘guid’: ‘58f0bdd7-1400’, ’lid’: ‘1’, …}, {‘guid’: ‘6f9f3ec9-1400’, ‘lid’: ‘2’, …} ] }
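
The local/global id exchange could look roughly like this on the server. It is a sketch, not the talk's implementation: `uuidV4` builds a random version 4 UUID with the standard library instead of a package such as ramsey/uuid, and `assignGuids` is a hypothetical name.

```php
<?php
/** Generates a random (version 4) UUID in the RFC 4122 layout. */
function uuidV4(): string
{
    $bytes = random_bytes(16);
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40); // set version to 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80); // set RFC 4122 variant
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

/**
 * Gives a server-side GUID to every record that only has a local id.
 * The lid is kept in the reply so the client can pair lid and guid.
 */
function assignGuids(array $records): array
{
    foreach ($records as &$record) {
        if (!isset($record['guid'])) {
            $record['guid'] = uuidV4();
        }
    }
    unset($record); // break the reference left by foreach

    return $records;
}

$response = assignGuids([
    ['lid' => '1'],
    ['lid' => '2', 'guid' => 'existing'],
]);
```

Clients never generate GUIDs in this scheme; they keep using their local ids until the reply tells them which GUID belongs to which record.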
  49. 49. conflict resolution algorithm (plain data): mobile-generated data are "temporary" until synced to the server; the server handles conflict resolution
  50. 50. conflict resolution algorithm (plain data): conflict resolution can be domain independent (e.g. last write wins) or domain dependent (use domain knowledge to resolve)
  51. 51. conflict resolution algorithm (plain data)

          function sync($data) {
              foreach ($data as $newRecord) {
                  $s = findByGuid($newRecord->getGuid());

                  // no conflict: the server has never seen this record
                  if (!$s) {
                      add($newRecord);
                      send($newRecord);
                      continue;
                  }

                  // remote wins: the client's copy is newer
                  if ($newRecord->updated > $s->updated) {
                      update($s, $newRecord);
                      send($newRecord);
                      continue;
                  }

                  // server wins: push the stored copy back to the client
                  updateRemote($newRecord, $s);
              }
          }
  56. 56. conflict resolution algorithm (plain data) { ‘lid’: ‘1’, ‘guid’: ‘af54d’, ‘data’ : ‘AAA’ ‘updated’: ’100’ } { ‘lid’: ‘2’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } c1 { ’guid’: ‘af54d’, ‘data’: ‘BBB’, ‘updated’ : ’20’ } server
  57. 57. conflict resolution algorithm (plain data) { ‘lid’: ‘1’, ‘guid’: ‘af54d’, ‘data’ : ‘AAA’ ‘updated’: ’100’ } { ‘lid’: ‘2’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } POST /merge { ’guid’: ‘af54d’, ‘data’: ‘BBB’, ‘updated’ : ’20’ } c1 server
  58. 58. conflict resolution algorithm (plain data) { ‘lid’: ‘1’, ‘guid’: ‘af54d’, ‘data’ : ‘AAA’ ‘updated’: ’100’ } { ‘lid’: ‘2’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } POST /merge { ‘guid’: ‘e324f’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } { ’guid’: ‘af54d’, ‘data’: ‘BBB’, ‘updated’ : ’20’ } c1 server
  60. 60. conflict resolution algorithm (plain data) { ‘lid’: ‘1’, ‘guid’: ‘af54d’, ‘data’ : ‘AAA’ ‘updated’: ’100’ } { ‘lid’: ‘2’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } POST /merge { ‘guid’: ‘e324f’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } { ’guid’: ‘af54d’, ‘data’: ‘AAA’, ‘updated’ : ’100’ } c1 server
  61. 61. conflict resolution algorithm (plain data) { ‘lid’: ‘1’, ‘guid’: ‘af54d’, ‘data’ : ‘AAA’ ‘updated’: ’100’ } { ‘guid’: ‘e324f’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } { ’guid’: ‘af54d’, ‘data’: ‘AAA’, ‘updated’ : ’100’ } { ‘lid’: ‘2’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } POST /merge c1 server {‘ok’ : { ’guid’: ‘af54d’ }} {‘update’ : { lid: ‘2’, ’guid’: ‘e324f’ }}
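
The walkthrough above can be replayed with a small self-contained version of the last-write-wins merge. The names (`mergeRecords`, the `$newGuid` callback) and the array-based store are assumptions made for this sketch; the reply shapes follow the `ok` / `update` messages shown on the slide, and the shape of the server-wins reply is a guess.

```php
<?php
/**
 * Last-write-wins merge of client records into the server store.
 * Returns the updated store and one reply per incoming record.
 */
function mergeRecords(array $store, array $incoming, callable $newGuid): array
{
    $replies = [];
    foreach ($incoming as $record) {
        $guid = $record['guid'] ?? null;
        $lid  = $record['lid'];
        unset($record['lid']); // local ids mean nothing on the server

        if ($guid === null || !isset($store[$guid])) {
            // no conflict: unknown record, store it under a server GUID
            $guid = $guid ?? $newGuid();
            $store[$guid] = ['guid' => $guid] + $record;
            $replies[] = ['update' => ['lid' => $lid, 'guid' => $guid]];
        } elseif ($record['updated'] > $store[$guid]['updated']) {
            // remote wins: the client copy is newer
            $store[$guid] = $record;
            $replies[] = ['ok' => ['guid' => $guid]];
        } else {
            // server wins: send the stored state back to the client
            $replies[] = ['update' => ['lid' => $lid] + $store[$guid]];
        }
    }
    return [$store, $replies];
}

$store = ['af54d' => ['guid' => 'af54d', 'data' => 'BBB', 'updated' => 20]];
$incoming = [
    ['lid' => '1', 'guid' => 'af54d', 'data' => 'AAA', 'updated' => 100],
    ['lid' => '2', 'data' => 'hello!', 'updated' => 15],
];
// a fixed GUID keeps the example reproducible
[$store, $replies] = mergeRecords($store, $incoming, fn (): string => 'e324f');
```

After the call the server holds `AAA` with `updated` 100 for `af54d` and a new record `e324f`, which matches the final state on the slide.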
  62. 62. conflict resolution algorithm (hierarchical data): How to manage hierarchical data? { 'lid': '123456', 'type': 'baby', … } { 'lid': '123456', 'type': 'temperature', 'baby_id': '123456' }
  63. 63. conflict resolution algorithm (hierarchical data): How to manage hierarchical data? 1) sync root record 2) update ids 3) sync child records { 'lid': '123456', 'type': 'baby', … } { 'lid': '123456', 'type': 'temperature', 'baby_id': '123456' }
  64. 64. conflict resolution algorithm (hierarchical data)

          function syncHierarchical($data) {
              sortByHierarchy($data); // parent records first

              foreach ($data as $newRecord) {
                  $s = findByGuid($newRecord->getGuid());

                  if ($newRecord->isRoot()) {
                      if (!$s) {
                          // no conflict
                          add($newRecord);
                          updateRecordIds($newRecord, $data);
                          send($newRecord);
                          continue;
                      }

                      if ($newRecord->updated > $s->updated) {
                          // remote wins
                          update($s, $newRecord);
                          updateRecordIds($newRecord, $data);
                          send($newRecord);
                          continue;
                      } else {
                          // server wins
                          updateRecordIds($s, $data);
                          updateRemote($newRecord, $s);
                      }
                  } else {
                      // child record: fall back to the plain-data algorithm
                      sync([$newRecord]);
                  }
              }
          }
  70. 70. conflict resolution algorithm (hierarchical data) { ‘lid’: ‘1’, ‘data’ : ‘AAA’ ‘updated’: ’100’ } { ‘lid’: ‘2’, ‘parent’: ‘1’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } POST /merge c1 server
  71. 71. conflict resolution algorithm (hierarchical data) { ‘lid’: ‘1’, ‘data’ : ‘AAA’ ‘updated’: ’100’ } { ‘lid’: ‘2’, ‘parent’: ‘1’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } c1 server POST /merge { ‘lid’: ‘1’, ‘guid’ : ‘32ead’, ‘data’ : ‘AAA’ ‘updated’: ’100’ }
  72. 72. conflict resolution algorithm (hierarchical data) { ‘lid’: ‘1’, ‘data’ : ‘AAA’ ‘updated’: ’100’ } { ‘lid’: ‘2’, ‘parent’: ‘32ead’, ‘data’ : ‘hello!’, ‘updated’: ’15’ } c1 server POST /merge { ‘lid’: ‘1’, ‘guid’ : ‘32ead’, ‘data’ : ‘AAA’ ‘updated’: ’100’ }
  73. 73. conflict resolution algorithm (hierarchical data) c1: { 'lid': '1', 'data': 'AAA', 'updated': '100' } { 'lid': '2', 'parent': '32ead', 'data': 'hello!', 'updated': '15' } POST /merge server: { 'lid': '1', 'guid': '32ead', 'data': 'AAA', 'updated': '100' } { 'lid': '2', 'parent': '32ead', 'data': 'hello!', 'updated': '15' } replies: {'update': { 'lid': '1', 'guid': '32ead' }} {'update': { 'lid': '2', 'guid': 'e324f' }}
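
The three steps (sync root record, update ids, sync child records) can be sketched for the simplest case: brand-new records in a two-level hierarchy. `mergeNewHierarchy` and the `$newGuid` callback are hypothetical names, and conflict handling is left out on purpose.

```php
<?php
/**
 * Stores new hierarchical records: roots first, then children,
 * rewriting each child's "parent" from the local id to the new GUID.
 * Assumes two levels only (a record either has a "parent" or not).
 */
function mergeNewHierarchy(array $incoming, callable $newGuid): array
{
    // parent records first: roots are the records without a "parent" key
    usort(
        $incoming,
        fn (array $a, array $b): int => isset($a['parent']) <=> isset($b['parent'])
    );

    $lidToGuid = [];
    $stored = [];
    foreach ($incoming as $record) {
        if (isset($record['parent'], $lidToGuid[$record['parent']])) {
            // update ids: point the child at the parent's server GUID
            $record['parent'] = $lidToGuid[$record['parent']];
        }
        $record['guid'] = $newGuid();
        $lidToGuid[$record['lid']] = $record['guid'];
        $stored[] = $record;
    }
    return $stored;
}

// fixed GUIDs keep the example reproducible; the child is sent first on purpose
$guids = ['32ead', 'e324f'];
$stored = mergeNewHierarchy(
    [
        ['lid' => '2', 'parent' => '1', 'data' => 'hello!', 'updated' => 15],
        ['lid' => '1', 'data' => 'AAA', 'updated' => 100],
    ],
    function () use (&$guids): string {
        return array_shift($guids);
    }
);
```

Deeper hierarchies would need a real topological sort instead of the root/child split used here.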
  74. 74. enforcing domain constraints: e.g. "only one temperature can be registered in a given day"; how do we enforce domain constraints on data?
  75. 75. enforcing domain constraints: e.g. "only one temperature can be registered in a given day"; how do we enforce domain constraints on data? 1) relax constraints
  76. 76. enforcing domain constraints: e.g. "only one temperature can be registered in a given day"; how do we enforce domain constraints on data? 1) relax constraints 2) integrate constraints in sync algorithm
  77. 77. enforcing domain constraints: from findByGuid to findSimilar; look up first by GUID, then by domain rules, e.g. "two measures are similar if they refer to the same date"
  78. 78. enforcing domain constraints c1 server
  79. 79. enforcing domain constraints { ’guid’: ‘af54d’, ‘when’: ‘20141005’ } c1 server
  80. 80. enforcing domain constraints { ‘lid’: ‘1’, ‘when’: ‘20141005’ } { ’guid’: ‘af54d’, ‘when’: ‘20141005’ } c1 server
  81. 81. enforcing domain constraints { ‘lid’: ‘1’, ‘when’: ‘20141005’ } { ’guid’: ‘af54d’, ‘when’: ‘20141005’ } POST /merge c1 server
  83. 83. enforcing domain constraints { ‘lid’: ‘1’, ‘when’: ‘20141005’ } { ’guid’: ‘af54d’, ‘when’: ‘20141005’ } POST /merge c1 server { ’guid’: ‘af54d’, ‘when’: ‘20141005’ }
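
A minimal sketch of the `findSimilar` lookup, assuming an array-based store keyed by GUID. The `type` field is an addition made for this example (the slide's records only show `when`), so that records of different kinds on the same date are not confused.

```php
<?php
/**
 * Looks a record up by GUID first, then by a domain rule:
 * two measures are "similar" when they refer to the same date.
 * Returns the matching server record, or null when there is none.
 */
function findSimilar(array $store, array $record): ?array
{
    if (isset($record['guid'], $store[$record['guid']])) {
        return $store[$record['guid']];
    }

    foreach ($store as $candidate) {
        if ($candidate['type'] === $record['type']
            && $candidate['when'] === $record['when']) {
            return $candidate;
        }
    }

    return null;
}

$store = [
    'af54d' => ['guid' => 'af54d', 'type' => 'temperature', 'when' => '20141005'],
];

// same day, no GUID yet: treated as the same measure
$match = findSimilar($store, ['lid' => '1', 'type' => 'temperature', 'when' => '20141005']);
// different day: a genuinely new record
$none = findSimilar($store, ['lid' => '2', 'type' => 'temperature', 'when' => '20141006']);
```

When a similar record is found, the regular conflict resolution applies to the pair, and the client ends up with the existing GUID instead of a duplicate.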
  84. 84. dealing with binary data: binary data is uploaded via a custom endpoint; sync data remains small; uploads can be resumed
  85. 85. dealing with binary data: two steps* 1) data are synchronized 2) related images are uploaded (* this means a record can exist without its file for some time)
  86. 86. dealing with binary data POST /merge { ‘lid’ : 1, ‘type’ : ‘baby’, ‘image’ : ‘myimage.jpg’ } { ‘lid’ : 1, ‘guid’ : ‘ac435-f8345’ } c1 server POST /upload/ac435-f8345/image
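
The second step can be sketched as a small planning function: once the merge reply has paired local ids with GUIDs, it lists the uploads that are still due. `pendingUploads` and the lid-to-guid map are hypothetical, and the path follows the `/upload/<guid>/image` form shown on the slide above.

```php
<?php
/**
 * Lists the uploads still to be performed after a merge: one entry per
 * record that references an image and has already received a GUID.
 */
function pendingUploads(array $localRecords, array $lidToGuid): array
{
    $uploads = [];
    foreach ($localRecords as $record) {
        $guid = $lidToGuid[$record['lid']] ?? null;
        if ($guid !== null && !empty($record['image'])) {
            $uploads[] = [
                'path' => "/upload/{$guid}/image",
                'file' => $record['image'],
            ];
        }
    }
    return $uploads;
}

$uploads = pendingUploads(
    [
        ['lid' => 1, 'type' => 'baby', 'image' => 'myimage.jpg'],
        ['lid' => 2, 'type' => 'note'], // no image: nothing to upload
    ],
    [1 => 'ac435-f8345', 2 => 'b1']
);
```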
  87. 87. What we learned: implementing this stuff is tricky; explore existing solutions if you can; understanding the domain is important
  88. 88. vector clocks
  90. 90. CRDT: Conflict-free Replicated Data Types (CRDTs) constrain the types of operations in order to: ensure convergence of changes to shared data by uncoordinated, concurrent actors; eliminate network failure modes as a source of error
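
As an illustration of "constraining the types of operations", here is a grow-only counter (G-Counter), one of the simplest CRDTs. It is not taken from the talk; the function names are made up for this sketch.

```php
<?php
/** Grow-only counter: one slot per replica, and slots never decrease. */
function gcIncrement(array $counter, string $replica, int $by = 1): array
{
    $counter[$replica] = ($counter[$replica] ?? 0) + $by;
    return $counter;
}

/** Merge keeps the per-replica maximum: commutative, associative, idempotent. */
function gcMerge(array $a, array $b): array
{
    foreach ($b as $replica => $count) {
        $a[$replica] = max($a[$replica] ?? 0, $count);
    }
    return $a;
}

/** The counter's value is the sum of all replica slots. */
function gcValue(array $counter): int
{
    return array_sum($counter);
}

$phone  = gcIncrement(gcIncrement([], 'phone'), 'phone'); // two offline increments
$tablet = gcIncrement([], 'tablet');                      // one offline increment
```

Since the only allowed operation is "increment my own slot", replicas can merge in any order, any number of times, and still converge on the same value.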
  91. 91. Couchbase Mobile: the gateway handles sync; data flows through channels (partition the data set, authorization, limit the data); uses revision trees
  92. 92. Riak: distributed DB; eventual/strong consistency; data types; configurable conflict resolution (db level for built-in data types, application level for custom data)
  93. 93. That’s all folks! Questions? Please leave feedback! https://joind.in/12959
  94. 94. Links http://www.objc.io/issue-10/sync-case-study.html http://www.objc.io/issue-10/data-synchronization.html https://dev.evernote.com/media/pdf/edam-sync.pdf http://blog.helftone.com/clear-in-the-icloud/ http://strongloop.com/strongblog/node-js-replication-mobile-offline-sync-loopback/ http://blog.denivip.ru/index.php/2014/04/data-syncing-in-core-data-based-ios-apps/?lang=en http://inessential.com/2014/02/15/vesper_sync_diary_8_the_problem_of_un http://culturedcode.com/things/blog/2010/12/state-of-sync-part-1.html http://programmers.stackexchange.com/questions/206310/data-synchronization-in-mobile-apps-multiple-devices-multiple-users http://bricklin.com/offline.htm http://blog.couchbase.com/why-mobile-sync
  95. 95. Links Vector Clocks http://basho.com/why-vector-clocks-are-easy/ http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks http://basho.com/why-vector-clocks-are-hard/ http://blog.8thlight.com/rylan-dirksen/2013/10/04/synchronization-in-a-distributed-system.html CRDTs http://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html http://www.infoq.com/presentations/problems-distributed-systems https://www.youtube.com/watch?v=qyVNG7fnubQ Riak http://docs.basho.com/riak/latest/dev/using/conflict-resolution/ Couchbase Sync Gateway http://docs.couchbase.com/sync-gateway/ http://www.infoq.com/presentations/sync-mobile-data API http://developers.amiando.com/index.php/REST_API_DataSync https://login.syncano.com/docs/rest/index.html
  96. 96. Credits phones https://www.flickr.com/photos/15216811@N06/14504964841 wat http://uturncrossfit.com/wp-content/uploads/2014/04/wait-what.jpg darth http://www.listal.com/viewimage/3825918h blueprint: http://upload.wikimedia.org/wikipedia/commons/5/5e/Joy_Oil_gas_station_blueprints.jpg building: http://s0.geograph.org.uk/geophotos/02/42/74/2427436_96c4cd84.jpg brownfield: http://s0.geograph.org.uk/geophotos/02/04/54/2045448_03a2fb36.jpg no connection: https://www.flickr.com/photos/77018488@N03/9004800239 no internet con https://www.flickr.com/photos/roland/9681237793 vector clocks: http://en.wikipedia.org/wiki/Vector_clock crdts: http://www.infoq.com/presentations/problems-distributed-systems
