Implementing data sync APIs for mobile apps @cloudconf

Today mobile apps are everywhere. They cannot count on a reliable, constant internet connection, so working offline is becoming a common pattern. That is fairly easy for read-only apps, but it quickly becomes tricky for apps that create data while offline. This talk is a case study of one possible architecture for enabling data synchronization in these situations. Topics covered include:
- id generation
- hierarchical data
- managing different data types
- sync algorithm

Published in: Internet

  1. Implementing data synchronization API for mobile apps
  2. Michele Orselli, CTO@Ideato; micheleorselli / ideatosrl; _orso_; mo@ideato.it
  3. Agenda: scenario, design choices, implementation, alternative approaches
  4. Dealing with conflicts: A1, A2, ?
  5. Scenario: brownfield project; several mobile apps for tracking user-generated data (calendar, notes, bio data); iOS & Android; ~10K users, steadily growing at 1.2K/month
  6. Scenario: MongoDB; legacy app based on CodeIgniter; existing RPC-wannabe-REST API for data sync
  7. Scenario: for every resource
      get updates: POST /m/:app/get/:user_id/:res/:updated_from
      create/send updates: POST /m/:app/update/:user_id/:res_id/:dev_id/:res
  8. api
  9. Scenario: ~6 different resources, ~12 calls per sync; apps sync by polling every 30 sec; every call syncs little data
  10. Challenge: rebuild the sync API for the old apps + 2 incoming ones; enable image synchronization; be more efficient than the previous API
  11. Existing Solutions (algorithms, protocols/API, platform, storage): timestamps, vector clocks, CRDTs; SyncML, Syncano, Azure Data Sync; CouchDB, Riak
  12. Not Invented Here? "Don't Reinvent The Wheel, Unless You Plan on Learning More About Wheels" (J. Atwood)
  13. Architecture: 2 different mobile platforms; several teams with different skill levels; changing storage wasn't an option; forcing a particular technology client side wasn't an option
  14. Architecture: thin clients (c1, c2, c3); the server holds the sync logic and conflict resolution
  15. Implementation: in the sync domain all resources are managed in the same way
  16. Implementation: for every app, one endpoint for getting new data, one endpoint for pushing changes, one endpoint for uploading images
  17. The new APIs:
      GET /apps/:app/users/:user_id/changes[?from=:from]
      POST /apps/:app/users/:user_id/merge
      POST /upload/:res_id/images
  18-24. Silex Implementation (diagram built up over seven slides; labels: Col 1, Col 2, Col 3, Sync Service)
  25-29. Silex Implementation (the same code is shown on five slides):
      use Silex\Application;
      use Symfony\Component\HttpFoundation\JsonResponse;
      use Symfony\Component\HttpFoundation\Request;

      // POST, to match the merge endpoint listed on slide 17
      $app->post('/apps/{mApp}/users/{userId}/merge',
          function ($mApp, $userId, Application $app, Request $request) {
              $lastSync = $request->get('from', null);
              $data = $request->get('data', false);
              $syncService = $app['syncService'];
              $syncService->merge($data, $lastSync, $userId);
              return new JsonResponse($syncService->getResult());
          }
      );
  30. Silex Implementation (service wiring):
      $app['mongodb'] = new MongoDb(…);
      $app['changesRepo'] = new ChangesRepository($app['mongodb']);
      $app['syncService'] = new SyncService($app['changesRepo']);
  31. Get changes: GET /apps/:app/users/:user_id/changes?from=:from. Is "from" a timestamp?
  32. Server suggests the sync time: timestamps are inaccurate, so the server suggests the "from" parameter to be used in the next request
  33-34. Server suggests the sync time (c1 and server):
      c1 → server: GET /changes
      server → c1: { 'next': 12345, 'data': […] }
      c1 → server: GET /changes?from=12345
      server → c1: { 'next': 45678, 'data': […] }
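The exchange on slides 33-34 can be sketched as follows. This is a minimal illustration, not the talk's implementation: the `ChangeLog` class is hypothetical, it keeps changes in memory instead of MongoDB, and it uses a server-side counter as the `next` token (the slides do not say how the real API computes that value).

```php
<?php
// Hypothetical in-memory change log; the real service stores changes in MongoDB.
final class ChangeLog
{
    /** @var array<int, array{seq:int, record:array}> */
    private array $entries = [];
    private int $seq = 0;

    // Append a change and stamp it with a server-side, increasing sequence number.
    public function append(array $record): int
    {
        $this->seq++;
        $this->entries[] = ['seq' => $this->seq, 'record' => $record];
        return $this->seq;
    }

    // Return every change after $from, plus the value the client should send next time.
    public function changesSince(int $from = 0): array
    {
        $data = [];
        foreach ($this->entries as $entry) {
            if ($entry['seq'] > $from) {
                $data[] = $entry['record'];
            }
        }
        return ['next' => $this->seq, 'data' => $data];
    }
}

$log = new ChangeLog();
$log->append(['id' => '1', 'type' => 'note']);
$first = $log->changesSince();                 // first sync: no "from"
$log->append(['id' => '2', 'type' => 'note']);
$second = $log->changesSince($first['next']);  // later sync: only what is new
```

Because the client only echoes back a value the server produced, the client's own clock never takes part in deciding which changes are new.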
  35. What to transfer (data format):
      {'id': '1', 'type': 'measure', '_deleted': true}
      {'id': '2', 'type': 'note'}
      {'id': '3', 'type': 'note'}
      PS: soft delete all the things!
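One way a client could apply such a batch is sketched below. The `applyChanges` function and the local store keyed by id are assumptions made for illustration, and the code is written in PHP only for consistency with the server examples; the real clients are mobile apps.

```php
<?php
// Apply a batch of changes to a local store keyed by record id.
// A record carrying '_deleted' => true is a tombstone: drop the local copy.
function applyChanges(array $local, array $changes): array
{
    foreach ($changes as $change) {
        $id = $change['id'];
        if (!empty($change['_deleted'])) {
            unset($local[$id]);
            continue;
        }
        $local[$id] = $change;
    }
    return $local;
}

$local = ['1' => ['id' => '1', 'type' => 'measure']];
$local = applyChanges($local, [
    ['id' => '1', 'type' => 'measure', '_deleted' => true],
    ['id' => '2', 'type' => 'note'],
    ['id' => '3', 'type' => 'note'],
]);
```

Soft deletes matter here because a hard delete on the server would leave nothing to send: the change feed needs a record to tell other devices that something disappeared.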
  36-38. Unique identifiers: how do we generate a unique id in a distributed system?
      - UUID (RFC 4122): several implementations in PHP (https://github.com/ramsey/uuid)
      - Local/Global Id: only the server generates GUIDs; clients use local ids to manage their records
  39. Unique identifiers (c1 and server):
      c1 → server: POST /merge { 'data': [ {'lid': '1', …}, {'lid': '2', …} ] }
      server → c1: { 'data': [ {'guid': '58f0bdd7-1400', 'lid': '1', …}, {'guid': '6f9f3ec9-1400', 'lid': '2', …} ] }
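The local/global id scheme from slide 39 might look like this on the server side. It is a sketch under assumptions: `assignGuids` is a hypothetical helper, and the generator is a random-hex stand-in rather than an RFC 4122 UUID, so that the example does not depend on a particular library.

```php
<?php
// Assign a server-side GUID to each record that arrives with only a local id.
// $newGuid is injected so the example does not depend on a specific UUID library.
function assignGuids(array $records, callable $newGuid): array
{
    $result = [];
    foreach ($records as $record) {
        if (!isset($record['guid'])) {
            $record['guid'] = $newGuid();
        }
        $result[] = $record;
    }
    return $result;
}

// Stand-in generator: random hex, not an RFC 4122 UUID.
$newGuid = function (): string {
    return bin2hex(random_bytes(16));
};

$response = ['data' => assignGuids(
    [['lid' => '1'], ['lid' => '2']],
    $newGuid
)];
```

The reply keeps each `lid` next to the new `guid`, which is what lets the client match the server's identifiers to its own records.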
  40. Conflict resolution algorithm (plain data): mobile-generated data are "temporary" until synced to the server; the server handles conflict resolution
  41. Conflict resolution algorithm (plain data): conflict resolution can be domain independent (e.g. last write wins) or domain dependent (use domain knowledge to resolve)
  42-46. Conflict resolution algorithm (plain data); slides 44-46 highlight the three branches in turn:
      function sync($data) {
          foreach ($data as $newRecord) {
              $s = findByGuid($newRecord->getGuid());
              // no conflict: the server has no record with this GUID
              if (!$s) {
                  add($newRecord);
                  send($newRecord);
                  continue;
              }
              // remote wins: the client's copy is newer
              if ($newRecord->updated > $s->updated) {
                  update($s, $newRecord);
                  send($newRecord);
                  continue;
              }
              // server wins: push the stored copy back to the client
              updateRemote($newRecord, $s);
          }
      }
  47. Conflict resolution algorithm (plain data), starting state:
      c1:     { 'lid': '1', 'guid': 'af54d', 'data': 'AAA', 'updated': '100' }
              { 'lid': '2', 'data': 'hello!', 'updated': '15' }
      server: { 'guid': 'af54d', 'data': 'BBB', 'updated': '20' }
  48. c1 sends both records with POST /merge
  49-50. No conflict for lid 2: the server stores it as { 'guid': 'e324f', 'data': 'hello!', 'updated': '15' }
  51. Remote wins for af54d: the server copy becomes { 'guid': 'af54d', 'data': 'AAA', 'updated': '100' }
  52. The server replies {'ok': { 'guid': 'af54d' }} and {'update': { 'lid': '2', 'guid': 'e324f' }}
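The walkthrough on slides 47-52 can be reproduced with a small last-write-wins merge. This is a sketch, not the talk's code: the `merge` function, the in-memory store keyed by GUID and the injected GUID generator are assumptions, and `updated` is an integer here rather than the quoted value shown on the slides.

```php
<?php
// Last-write-wins merge against an in-memory server store keyed by GUID.
// Returns the updated store and the per-record replies for the client.
function merge(array $store, array $incoming, callable $newGuid): array
{
    $replies = [];
    foreach ($incoming as $record) {
        $guid = $record['guid'] ?? null;
        $lid = $record['lid'];
        unset($record['lid']);              // local ids are client-only

        if ($guid === null || !isset($store[$guid])) {
            // No conflict: new record, the server assigns the GUID if needed.
            $guid = $guid ?? $newGuid();
            $record['guid'] = $guid;
            $store[$guid] = $record;
            $replies[] = ['update' => ['lid' => $lid, 'guid' => $guid]];
        } elseif ($record['updated'] > $store[$guid]['updated']) {
            // Remote wins: the client's copy is newer.
            $store[$guid] = $record;
            $replies[] = ['ok' => ['guid' => $guid]];
        } else {
            // Server wins: send the stored copy back to the client.
            $replies[] = ['update' => ['lid' => $lid] + $store[$guid]];
        }
    }
    return [$store, $replies];
}

$store = ['af54d' => ['guid' => 'af54d', 'data' => 'BBB', 'updated' => 20]];
$incoming = [
    ['lid' => '1', 'guid' => 'af54d', 'data' => 'AAA', 'updated' => 100],
    ['lid' => '2', 'data' => 'hello!', 'updated' => 15],
];
[$store, $replies] = merge($store, $incoming, function (): string {
    return 'e324f';                         // fixed GUID to mirror the slides
});
```

With the slide data, the server copy of `af54d` ends up as `AAA`/100 and the replies match slide 52: an `ok` for `af54d` and an `update` that maps lid 2 to `e324f`.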
  53-54. Conflict resolution algorithm (hierarchical data): how to manage hierarchical data?
      { 'lid': '123456', 'type': 'baby', … }
      { 'lid': '123456', 'type': 'temperature', 'baby_id': '123456' }
      1) sync root record
      2) update ids
      3) sync child records
  55-60. Conflict resolution algorithm (hierarchical data); the code spans two slides and the "…" marks the slide break:
      function syncHierarchical($data) {
          sortByHierarchy($data);                  // parent records first
          foreach ($data as $newRecord) {
              $s = findByGuid($newRecord->getGuid());
              if ($newRecord->isRoot()) {
                  // no conflict
                  if (!$s) {
                      add($newRecord);
                      updateRecordIds($newRecord, $data);
                      send($newRecord);
                      continue;
                  }
                  …
                  // remote wins
                  if ($newRecord->updated > $s->updated) {
                      update($s, $newRecord);
                      updateRecordIds($newRecord, $data);
                      send($newRecord);
                      continue;
                  } else {
                      // server wins
                      updateRecordIds($s, $data);
                      updateRemote($newRecord, $s);
                  }
              } else {
                  sync($data);
              }
          }
      }
  61. Conflict resolution algorithm (hierarchical data): c1 sends POST /merge with
      { 'lid': '1', 'data': 'AAA', 'updated': '100' }
      { 'lid': '2', 'parent': '1', 'data': 'hello!', 'updated': '15' }
  62. The server stores the root record as { 'lid': '1', 'guid': '32ead', 'data': 'AAA', 'updated': '100' }
  63. The child's parent reference is rewritten from '1' to '32ead'
  64. The server stores the child as { 'lid': '2', 'parent': '32ead', 'data': 'hello!', 'updated': '15' } and replies {'update': { 'lid': '1', 'guid': 'af54d' }} and {'update': { 'lid': '2', 'guid': 'e324f' }}
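The three steps (sync the root, update ids, sync the children) can be sketched as below. The `mergeHierarchical` function is hypothetical, it handles a single level of nesting only, and it leaves out conflict handling so that the id rewriting stays visible.

```php
<?php
// Sync a batch in which children point at their parent's local id.
// Roots are stored first; each child's 'parent' is then rewritten from the
// parent's lid to the GUID the server assigned.
function mergeHierarchical(array $incoming, callable $newGuid): array
{
    // Parents first: records without a 'parent' key sort before the others.
    usort($incoming, function (array $a, array $b): int {
        return isset($a['parent']) <=> isset($b['parent']);
    });

    $lidToGuid = [];
    $stored = [];
    foreach ($incoming as $record) {
        if (isset($record['parent'], $lidToGuid[$record['parent']])) {
            $record['parent'] = $lidToGuid[$record['parent']];
        }
        $record['guid'] = $record['guid'] ?? $newGuid();
        $lidToGuid[$record['lid']] = $record['guid'];
        $stored[] = $record;
    }
    return $stored;
}

$guids = ['32ead', 'e324f'];                // fixed GUIDs for a repeatable run
$stored = mergeHierarchical(
    [
        ['lid' => '2', 'parent' => '1', 'data' => 'hello!', 'updated' => 15],
        ['lid' => '1', 'data' => 'AAA', 'updated' => 100],
    ],
    function () use (&$guids): string {
        return array_shift($guids);
    }
);
```

Sorting first is what makes the rewrite possible: by the time a child is processed, its parent already has a GUID to point at.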
  65-67. Enforcing domain constraints: how do we enforce domain constraints on data? E.g. "only one temperature can be registered in a given day"
      1) relax constraints
      2) integrate constraints in the sync algorithm
  68. Enforcing domain constraints: from findByGuid to findSimilar; first look up by GUID, then by domain rules, e.g. "two measures are similar if they refer to the same date"
  69-70. Enforcing domain constraints: the server already holds { 'guid': 'af54d', 'when': '20141005' }
  71. c1 creates { 'lid': '1', 'when': '20141005' } locally
  72-73. c1 sends it with POST /merge
  74. The server replies with { 'guid': 'af54d', 'when': '20141005' }: the two measures are treated as the same record
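A lookup along the lines of findSimilar might be written as follows. The signature, the in-memory store and the `when` comparison are assumptions based on the slide example; the talk does not show the real implementation.

```php
<?php
// Look a record up by GUID first, then fall back to a domain rule:
// two measures are "similar" when they refer to the same date.
function findSimilar(array $store, array $record): ?array
{
    if (isset($record['guid'], $store[$record['guid']])) {
        return $store[$record['guid']];
    }
    foreach ($store as $candidate) {
        if ($candidate['when'] === $record['when']) {
            return $candidate;
        }
    }
    return null;
}

$store = ['af54d' => ['guid' => 'af54d', 'when' => '20141005']];
$match = findSimilar($store, ['lid' => '1', 'when' => '20141005']);
$miss  = findSimilar($store, ['lid' => '2', 'when' => '20141006']);
```

Here the record created offline has no GUID yet, but it matches the stored measure by date, so the sync algorithm can treat it as an update to `af54d` instead of adding a second temperature for the same day.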
  75. Dealing with binary data: binary data are uploaded via a custom endpoint; sync data remains small; uploads can be resumed
  76. Dealing with binary data: two steps*: 1) data are synchronized, 2) related images are uploaded. * This means a record exists without its file for a while
  77. Dealing with binary data (c1 and server):
      c1 → server: POST /merge { 'lid': 1, 'type': 'baby', 'image': 'myimage.jpg' }
      server → c1: { 'lid': 1, 'guid': 'ac435-f8345' }
      c1 → server: POST /upload/ac435-f8345/image
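The second step can be sketched as a function that turns the merge reply into a list of pending uploads. `pendingUploads` and the shape of its return value are hypothetical; the field names and the URL pattern follow slide 77.

```php
<?php
// Step 2 of the two-step flow: once /merge has returned GUIDs, work out which
// image uploads are still pending. Field names follow the slides.
function pendingUploads(array $localRecords, array $mergeReply): array
{
    // Map local id -> server GUID from the merge reply.
    $guidByLid = [];
    foreach ($mergeReply as $entry) {
        $guidByLid[$entry['lid']] = $entry['guid'];
    }

    $uploads = [];
    foreach ($localRecords as $record) {
        if (isset($record['image'], $guidByLid[$record['lid']])) {
            $uploads[] = [
                'url'  => '/upload/' . $guidByLid[$record['lid']] . '/image',
                'file' => $record['image'],
            ];
        }
    }
    return $uploads;
}

$uploads = pendingUploads(
    [['lid' => 1, 'type' => 'baby', 'image' => 'myimage.jpg']],
    [['lid' => 1, 'guid' => 'ac435-f8345']]
);
```

Keeping the upload separate means a failed or interrupted image transfer can be retried later without repeating the data merge.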
  78. What we learned: implementing this stuff is tricky; explore existing solutions if you can; understanding the domain is important
  79. Vector clocks
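For readers new to vector clocks, the core comparison can be sketched as below. This is a generic textbook illustration, not something shown in the talk: each clock is assumed to be a map from node id to counter, and `compareClocks` is a hypothetical name.

```php
<?php
// Compare two vector clocks (maps of node id => counter).
// Returns 'before', 'after', 'equal' or 'concurrent'.
function compareClocks(array $a, array $b): string
{
    $aAhead = false;
    $bAhead = false;
    $nodes = array_unique(array_merge(array_keys($a), array_keys($b)));
    foreach ($nodes as $node) {
        $x = $a[$node] ?? 0;                // a missing entry counts as 0
        $y = $b[$node] ?? 0;
        if ($x > $y) { $aAhead = true; }
        if ($y > $x) { $bAhead = true; }
    }
    if ($aAhead && $bAhead) { return 'concurrent'; }
    if ($aAhead) { return 'after'; }
    if ($bAhead) { return 'before'; }
    return 'equal';
}

$r1 = compareClocks(['c1' => 2, 'c2' => 1], ['c1' => 1, 'c2' => 1]);
$r2 = compareClocks(['c1' => 2, 'c2' => 0], ['c1' => 1, 'c2' => 1]);
```

The first pair is ordered (`after`), while the second is `concurrent`: neither version descends from the other, which is exactly the case a timestamp comparison cannot detect.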
  80. CRDT: Conflict-free Replicated Data Types (CRDTs) constrain the types of operations in order to:
      - ensure convergence of changes to shared data by uncoordinated, concurrent actors
      - eliminate network failure modes as a source of error
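A grow-only counter is probably the smallest CRDT that shows how constraining operations gives convergence. The sketch below is a generic illustration with hypothetical function names, not code from the talk.

```php
<?php
// Grow-only counter (G-Counter): each replica increments only its own slot;
// merge takes the per-replica maximum, so merges can happen in any order.
function gcIncrement(array $counter, string $replica, int $by = 1): array
{
    $counter[$replica] = ($counter[$replica] ?? 0) + $by;
    return $counter;
}

function gcMerge(array $a, array $b): array
{
    foreach ($b as $replica => $count) {
        $a[$replica] = max($a[$replica] ?? 0, $count);
    }
    return $a;
}

function gcValue(array $counter): int
{
    return array_sum($counter);
}

$c1 = gcIncrement(gcIncrement([], 'c1'), 'c1');   // c1 counts twice offline
$c2 = gcIncrement([], 'c2');                      // c2 counts once offline
$ab = gcMerge($c1, $c2);
$ba = gcMerge($c2, $c1);
```

Both merge orders give the same total, and merging a state with itself changes nothing, so replicas converge without coordinating with each other.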
  81. Couchbase Mobile: gateway handles sync; data flows through channels (partition the data set, authorization, limit the data); uses revision trees
  82. Riak: distributed DB; eventual/strong consistency; data types; configurable conflict resolution (db level for built-in data types, application level for custom data)
  83. Questions? See you in Verona! jsDay, 13th-14th of May, http://2015.jsday.it/; phpDay, 15th-16th of May, http://2015.phpday.it/
  84. Links:
      - http://www.objc.io/issue-10/sync-case-study.html
      - http://www.objc.io/issue-10/data-synchronization.html
      - https://dev.evernote.com/media/pdf/edam-sync.pdf
      - http://blog.helftone.com/clear-in-the-icloud/
      - http://strongloop.com/strongblog/node-js-replication-mobile-offline-sync-loopback/
      - http://blog.denivip.ru/index.php/2014/04/data-syncing-in-core-data-based-ios-apps/?lang=en
      - http://inessential.com/2014/02/15/vesper_sync_diary_8_the_problem_of_un
      - http://culturedcode.com/things/blog/2010/12/state-of-sync-part-1.html
      - http://programmers.stackexchange.com/questions/206310/data-synchronization-in-mobile-apps-multiple-devices-multiple-users
      - http://bricklin.com/offline.htm
      - http://blog.couchbase.com/why-mobile-sync
  85. Links:
      Vector Clocks
      - http://basho.com/why-vector-clocks-are-easy/
      - http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks
      - http://basho.com/why-vector-clocks-are-hard/
      - http://blog.8thlight.com/rylan-dirksen/2013/10/04/synchronization-in-a-distributed-system.html
      CRDTs
      - http://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html
      - http://www.infoq.com/presentations/problems-distributed-systems
      - https://www.youtube.com/watch?v=qyVNG7fnubQ
      Riak
      - http://docs.basho.com/riak/latest/dev/using/conflict-resolution/
      Couchbase Sync Gateway
      - http://docs.couchbase.com/sync-gateway/
      - http://www.infoq.com/presentations/sync-mobile-data
      API
      - http://developers.amiando.com/index.php/REST_API_DataSync
      - https://login.syncano.com/docs/rest/index.html
  86. Credits:
      - phones: https://www.flickr.com/photos/15216811@N06/14504964841
      - wat: http://uturncrossfit.com/wp-content/uploads/2014/04/wait-what.jpg
      - darth: http://www.listal.com/viewimage/3825918h
      - blueprint: http://upload.wikimedia.org/wikipedia/commons/5/5e/Joy_Oil_gas_station_blueprints.jpg
      - building: http://s0.geograph.org.uk/geophotos/02/42/74/2427436_96c4cd84.jpg
      - brownfield: http://s0.geograph.org.uk/geophotos/02/04/54/2045448_03a2fb36.jpg
      - no connection: https://www.flickr.com/photos/77018488@N03/9004800239
      - no internet con: https://www.flickr.com/photos/roland/9681237793
      - vector clocks: http://en.wikipedia.org/wiki/Vector_clock
      - crdts: http://www.infoq.com/presentations/problems-distributed-systems
