APACHE SLING & FRIENDS TECH MEETUP 
BERLIN, 22-24 S EPTEMBER 2014 
Data replication in Sling 
Tommaso Teofili, Adobe Systems
Use cases 
adaptTo() 2014 2
Replication in AEM 
§ Moving authored content to publish servers 
author 
publish 
adaptTo() 2014 3
Replication in AEM 
§ Moving user generated content back to 
author for moderation 
author 
publish 
adaptTo() 2014 4
Replication in AEM 
§ Moving user generated content to other 
publish instances 
author 
publish 
publish 
adaptTo() 2014 5
Replication module should ... 
§ transfer resources between Sling instances 
§ push from server A to server B 
§ pull from server C to server D 
adaptTo() 2014 6
Example: replication of /foo/bar 
§ HTTP request for pushing /foo/bar 
§ Resources get 
§ packaged 
§ sent 
§ received 
§ persisted 
adaptTo() 2014 7
we’re not in 1999 ... anymore 
adaptTo() 2014 8
NRT sync on N servers 
publish1 
publish2 
author1 
author2 
publish3 
publish4 
publish5 
adaptTo() 2014 9
Oak instances served by same MongoDB 
publish1 
author 
publish2 
oak 
oak 
mongo 
mongo 
mongo 
mongo 
p1 
or 
p2 
adaptTo() 2014 10
Cloud based infrastructures 
P1 
to 
pN 
publish1 
author1 
author2 
publish2 
publish3 
publish4 
publish4 
publish4 
publish4 
publish4 
publishN 
adaptTo() 2014 11
Overview 
adaptTo() 2014 12
Sling Replication 
§ Contributed to Sling in November 2013 
§ First release ? 
§ Main goals 
§ Simple 
§ Resilient 
§ Fast 
adaptTo() 2014 13
Replication agents 
§ Execute replication requests by: 
§ exporting replication packages from a (remote) 
Sling instance 
§ importing replication packages into a (remote) 
Sling instance 
adaptTo() 2014 14
Replication agents 
§ A “push” agent has 
§ A “local” exporter 
§ Creating a package locally for the resources to be 
replicated (e.g. from the underlying JCR repo) 
§ A “remote” importer 
§ Importing the exported package remotely by sending 
it to a designated endpoint to persist it 
adaptTo() 2014 15
Replication agents 
§ A “pull” agent has 
§ A “remote” exporter 
§ Pulling a package from a remote Sling instance 
designated endpoint 
§ A “local” importer 
§ Importing the package locally by persisting it into the 
Sling instance 
adaptTo() 2014 16
Replication agents 
§ A “coordinating” agent has 
§ A “remote” exporter 
§ Pulling a package from a remote endpoint 
§ A “remote” importer 
§ Importing the package remotely into a Sling instance 
adaptTo() 2014 17
Replication agents 
§ A “queuing” agent has 
§ A “local” exporter 
§ Creating a package locally for the resources to be 
replicated (e.g. from the underlying JCR repo) 
§ No importer 
adaptTo() 2014 18
Replication package serialization 
§ Payload to be sent / received 
§ Package builders for (de)serialization 
§ Jackrabbit FileVault based package builder 
adaptTo() 2014 19
Replication queues 
§ Multiple queue providers 
§ Sling Jobs based (sling.event bundle) 
§ In memory 
§ Multiple queue distribution strategies 
§ Single 
§ Error aware 
§ Priority 
adaptTo() 2014 20
Replication rules 
§ Rules can be defined within agents 
§ To trigger replications upon resource changes 
§ To schedule periodic replications 
§ To chain replicate 
§ ... 
adaptTo() 2014 21
Accessing replication resources 
§ Agents as OSGi services 
§ Can be defined via ConfigurationAdmin 
§ Resource providers control 
§ Service access 
§ CRUD operations on configs 
adaptTo() 2014 22
Agent configuration example (1/3) 
§ Config for a ‘push agent’ 
{ 
"jcr:primaryType" : "sling:OsgiConfig", 
"name" : "publish", 
"type" : "simple", 
"packageExporter": [ 
"type=local”, 
"packageBuilder/type=vlt", 
... 
], 
... 
adaptTo() 2014 23
Agent configuration example (2/3) 
... 
"packageImporter" : [ 
"type=remote", 
"endpoints[0]=http://.../replication/services/importers/default”, 
"authenticationFactory/type=service", 
"authenticationFactory/name=user", 
... 
"packageBuilder/type=vlt", 
... 
], 
... 
adaptTo() 2014 24
Agent configuration example (3/3) 
... 
"queueProvider" : [ 
"type=service", 
"name=sjh" 
], 
"queueDistributionStrategy" : [ 
"type=service", 
"name=error" 
] 
} 
adaptTo() 2014 25
Anatomy of a forward replication request 
adaptTo() 2014 26
Anatomy of a fwd. replication (1/4) 
§ HTTP POST on /libs/sling/replication/ 
service/agents/publish with form parameters 
§ Action = ADD 
§ Path = /content/replication 
§ Agent with name ‘publish’ gets picked up by 
the agent resource provider 
adaptTo() 2014 27
Anatomy of a fwd. replication (2/4) 
§ Agent ‘publish’ calls its Exporter (local) to 
create the payload 
§ the ‘local’ exporter calls its Package builder 
§ the ‘vlt’ package builder creates a FileVault 
package for resources under /foo/bar 
adaptTo() 2014 28
Anatomy of a fwd. replication (3/4) 
§ Agent ‘publish’ dispatches the package to its 
queue provider and distribution algorithm 
§ The queue provider is asked to provide queues 
depending on the distribution strategy 
§ The request gets queued 
adaptTo() 2014 29
Anatomy of a fwd. replication (4/4) 
§ When queue entry gets processed the 
(remote) importer is called 
§ Package sent over the wire to an endpoint bound to a 
local importer on the receiving server 
§ The local importer on the receiving side deserializes 
and persists the replication package via its package 
builder 
adaptTo() 2014 30
Reverse replication 
§ Pull agent on author 
§ Periodically polling 
§ Queuing agent on publish 
adaptTo() 2014 31
Event based reverse replication 
§ No scheduled polling 
§ The publish instance notifies author when to 
pull 
§ Notification through server sent events 
adaptTo() 2014 32
Looking forward 
§ Get information on other instances 
§ e.g. replicate to all instances with run mode ‘xyz’ 
§ NRT publish sync 
§ Coordinate agents++ 
§ Performance 
adaptTo() 2014 33
Thanks! 
§ Questions? 
adaptTo() 2014 34

Data replication in Sling

  • 1.
    APACHE SLING &FRIENDS TECH MEETUP BERLIN, 22-24 S EPTEMBER 2014 Data replication in Sling Tommaso Teofili, Adobe Systems
  • 2.
  • 3.
    Replication in AEM § Moving authored content to publish servers author publish adaptTo() 2014 3
  • 4.
    Replication in AEM § Moving user generated content back to author for moderation author publish adaptTo() 2014 4
  • 5.
    Replication in AEM § Moving user generated content to other publish instances author publish publish adaptTo() 2014 5
  • 6.
    Replication module should... § transfer resources between Sling instances § push from server A to server B § pull from server C to server D adaptTo() 2014 6
  • 7.
    Example: replication of/foo/bar § HTTP request for pushing /foo/bar § Resources get § packaged § sent § received § persisted adaptTo() 2014 7
  • 8.
    we’re not in1999 ... anymore adaptTo() 2014 8
  • 9.
    NRT sync onN servers publish1 publish2 author1 author2 publish3 publish4 publish5 adaptTo() 2014 9
  • 10.
    Oak instances servedby same MongoDB publish1 author publish2 oak oak mongo mongo mongo mongo p1 or p2 adaptTo() 2014 10
  • 11.
    Cloud based infrastructures P1 to pN publish1 author1 author2 publish2 publish3 publish4 publish4 publish4 publish4 publish4 publishN adaptTo() 2014 11
  • 12.
  • 13.
    Sling Replication §Contributed to Sling in November 2013 § First release ? § Main goals § Simple § Resilient § Fast adaptTo() 2014 13
  • 14.
    Replication agents §Execute replication requests by: § exporting replication packages from a (remote) Sling instance § importing replication packages into a (remote) Sling instance adaptTo() 2014 14
  • 15.
    Replication agents §A “push” agent has § A “local” exporter § Creating a package locally for the resources to be replicated (e.g. from the underlying JCR repo) § A “remote” importer § Importing the exported package remotely by sending it to a designated endpoint to persist it adaptTo() 2014 15
  • 16.
    Replication agents §A “pull” agent has § A “remote” exporter § Pulling a package from a remote Sling instance designated endpoint § A “local” importer § Importing the package locally by persisting it into the Sling instance adaptTo() 2014 16
  • 17.
    Replication agents §A “coordinating” agent has § A “remote” exporter § Pulling a package from a remote endpoint § A “remote” importer § Importing the package remotely into a Sling instance adaptTo() 2014 17
  • 18.
    Replication agents §A “queuing” agent has § A “local” exporter § Creating a package locally for the resources to be replicated (e.g. from the underlying JCR repo) § No importer adaptTo() 2014 18
  • 19.
    Replication package serialization § Payload to be sent / received § Package builders for (de)serialization § Jackrabbit FileVault based package builder adaptTo() 2014 19
  • 20.
    Replication queues §Multiple queue providers § Sling Jobs based (sling.event bundle) § In memory § Multiple queue distribution strategies § Single § Error aware § Priority adaptTo() 2014 20
  • 21.
    Replication rules §Rules can be defined within agents § To trigger replications upon resource changes § To schedule periodic replications § To chain replicate § ... adaptTo() 2014 21
  • 22.
    Accessing replication resources § Agents as OSGi services § Can be defined via ConfigurationAdmin § Resource providers control § Service access § CRUD operations on configs adaptTo() 2014 22
  • 23.
    Agent configuration example(1/3) § Config for a ‘push agent’ { "jcr:primaryType" : "sling:OsgiConfig", "name" : "publish", "type" : "simple", "packageExporter": [ "type=local”, "packageBuilder/type=vlt", ... ], ... adaptTo() 2014 23
  • 24.
    Agent configuration example(2/3) ... "packageImporter" : [ "type=remote", "endpoints[0]=http://.../replication/services/importers/default”, "authenticationFactory/type=service", "authenticationFactory/name=user", ... "packageBuilder/type=vlt", ... ], ... adaptTo() 2014 24
  • 25.
    Agent configuration example(3/3) ... "queueProvider" : [ "type=service", "name=sjh" ], "queueDistributionStrategy" : [ "type=service", "name=error" ] } adaptTo() 2014 25
  • 26.
    Anatomy of aforward replication request adaptTo() 2014 26
  • 27.
    Anatomy of afwd. replication (1/4) § HTTP POST on /libs/sling/replication/ service/agents/publish with form parameters § Action = ADD § Path = /content/replication § Agent with name ‘publish’ gets picked up by the agent resource provider adaptTo() 2014 27
  • 28.
    Anatomy of afwd. replication (2/4) § Agent ‘publish’ calls its Exporter (local) to create the payload § the ‘local’ exporter calls its Package builder § the ‘vlt’ package builder creates a FileVault package for resources under /foo/bar adaptTo() 2014 28
  • 29.
    Anatomy of afwd. replication (3/4) § Agent ‘publish’ dispatches the package to its queue provider and distribution algorithm § The queue provider is asked to provide queues depending on the distribution strategy § The request gets queued adaptTo() 2014 29
  • 30.
    Anatomy of afwd. replication (4/4) § When queue entry gets processed the (remote) importer is called § Package sent over the wire to an endpoint bound to a local importer on the receiving server § The local importer on the receiving side deserializes and persists the replication package via its package builder adaptTo() 2014 30
  • 31.
    Reverse replication §Pull agent on author § Periodically polling § Queuing agent on publish adaptTo() 2014 31
  • 32.
    Event based reversereplication § No scheduled polling § The publish instance notifies author when to pull § Notification through server sent events adaptTo() 2014 32
  • 33.
    Looking forward §Get information on other instances § e.g. replicate to all instances with run mode ‘xyz’ § NRT publish sync § Coordinate agents++ § Performance adaptTo() 2014 33
  • 34.
    Thanks! § Questions? adaptTo() 2014 34