Moving From Actions & Behaviors to Microservices

Jeff Potts, Metaversant
@jeffpotts01

How do we make it easier to integrate Alfresco with other systems?

Recurring customer requirements
• “We want to be able to report against metadata in real-time.”
• “When this custom property changes we need to notify this other system.”
• “We want to improve how Alfresco transforms Word documents into HTML.”
• “When content changes we want to run it through an NLP model.”
• “Our company has an enterprise search solution that needs to index Alfresco content.”
• “We want to replicate content between multiple Alfresco servers.”

Traditional approaches run in-process
• Custom Alfresco Actions
– Java, deployed to Alfresco WAR
– Triggered by rule on a folder, a UI action, or by a schedule
• Custom Alfresco Behaviors
– Java, deployed to Alfresco WAR
– Bound to a policy on a class of nodes (e.g., specific type or aspect)
• Custom Web Scripts
– Java or JavaScript, deployed to Alfresco WAR
– Triggered by a REST call
• All of these run in Alfresco’s process

Tradeoffs of the traditional approach
• Advantages
– Full access to the Alfresco API
– Runs as the authenticated user or as the system user
– Code is managed with the content model and other customizations
• Disadvantages
– Performance risk
– Requires server restart to deploy
– Requires an Alfresco developer familiar with the Alfresco API; Java & JavaScript are the only practical language options
– Long-running tasks may block the user interface
– Scales only as Alfresco scales; cannot be scaled independently

Event-based integration approach
• Alfresco can be extended to generate generic events when something
happens to a node
• Interested systems
– Listen for Alfresco events
– Filter out what they don’t care about
– Fetch additional data from Alfresco and perform custom logic as needed
• Additional systems can be added without touching Alfresco
• Systems can use different frameworks & languages
• Independently scalable
• Can use Alfresco Kafka as a starting point
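
The listen/filter/act pattern in the list above can be sketched in plain Java. This is a minimal in-memory illustration only, not the add-on's code: SketchEvent and SketchTopic are hypothetical names, and the topic here stands in for a real broker such as Apache Kafka.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical, simplified event: just enough fields to filter on.
final class SketchEvent {
    final String eventType;
    final String contentType;
    final String nodeRef;

    SketchEvent(String eventType, String contentType, String nodeRef) {
        this.eventType = eventType;
        this.contentType = contentType;
        this.nodeRef = nodeRef;
    }
}

// Hypothetical in-memory stand-in for a broker topic: every subscriber sees every event.
final class SketchTopic {
    private final List<Consumer<SketchEvent>> subscribers = new ArrayList<>();

    // New subscribers are added here; the publisher does not change.
    void subscribe(Predicate<SketchEvent> filter, Consumer<SketchEvent> handler) {
        subscribers.add(event -> {
            // Each subscriber drops the events it does not care about.
            if (filter.test(event)) {
                handler.accept(event);
            }
        });
    }

    // The publisher emits generic events and knows nothing about the subscribers.
    void publish(SketchEvent event) {
        for (Consumer<SketchEvent> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}
```

In a real deployment each subscriber would be a separate process with its own consumer group, which is what allows it to be scaled and deployed independently of Alfresco.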

Move logic out of Alfresco into microservices
[Diagram: Alfresco, running the alfresco-kafka add-on with the Kafka client JAR, publishes events to Apache Kafka; multiple microservices each consume the events.]

Example event JSON
{
  "nodeRef": "3f375925-fa87-4e34-9734-b98bed2d483f",
  "eventType": "CREATE",
  "path": "/{http://www.alfresco.org/model/application/1.0}company_home/…/{http://www.alfresco.org/model/content/1.0}test2.txt",
  "created": 1497282061322,
  "modified": 1497282061322,
  "creator": "admin",
  "modifier": "admin",
  "mimetype": "text/plain",
  "contentType": "content",
  "siteId": "test-site-1",
  "size": 128,
  "parent": "06a154e3-4014-4a55-adfa-5e55040fae2d"
}
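
A consumer needs some object to hold these fields. The following is a minimal sketch of such a holder, assuming the field names and types visible in the JSON above; NodeEventSketch is a hypothetical name, and the add-on's actual NodeEvent class may be shaped differently. The created and modified values are treated as epoch milliseconds.

```java
import java.time.Instant;

// Hypothetical holder mirroring the example event JSON; not the add-on's NodeEvent class.
class NodeEventSketch {
    String nodeRef;
    String eventType;
    String path;
    long created;      // epoch milliseconds
    long modified;     // epoch milliseconds
    String creator;
    String modifier;
    String mimetype;
    String contentType;
    String siteId;
    long size;         // content size in bytes (assumed unit)
    String parent;

    // Convert the raw timestamps into java.time values for easier handling.
    Instant createdAt() {
        return Instant.ofEpochMilli(created);
    }

    Instant modifiedAt() {
        return Instant.ofEpochMilli(modified);
    }

    // True when the node has not been modified since it was created.
    boolean isUnmodifiedSinceCreation() {
        return created == modified;
    }
}
```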

Alfresco Kafka Listener Example
• Alfresco Kafka
– https://github.com/jpotts/alfresco-kafka
• Alfresco Kafka Listener Example
– https://github.com/jpotts/alfresco-kafka-listener-example
• Demo: https://youtu.be/K40M2gJA7vM

Alfresco Kafka Listener
• Small Spring Boot app
• Runs in a servlet container
• Logs Alfresco Kafka events
• Example/starter code
[Diagram: Alfresco, running the alfresco-kafka add-on with the Kafka client JAR, publishes events to Apache Kafka; alfresco-kafka-listener consumes them.]

GenerateNodeEvent behavior calls MessageService
@Override
public void onCreateNode(ChildAssociationRef childAssocRef) {
    NodeRef nodeRef = childAssocRef.getChildRef();
    // Publish only if the node still exists when the behavior fires
    if (nodeService.exists(nodeRef)) {
        messageService.publish(nodeRef, NodeEvent.EventType.CREATE);
    }
}

MessageService sends JSON to the Kafka topic
public void init() {
    producer = new KafkaProducer<>(createProducerConfig());
}

// Build an event from the node, stamp it with the event type, and send it
public void publish(NodeRef nodeRef, NodeEvent.EventType eventType) {
    NodeEvent e = nodeTransformer.transform(nodeRef);
    e.setEventType(eventType);
    publish(e);
}

// Serialize the event to JSON and send it to the configured topic
private void publish(NodeEvent event) {
    try {
        final String message = mapper.writeValueAsString(event);
        if (message != null && message.length() != 0) {
            producer.send(new ProducerRecord<String, String>(topic, message));
        }
    } catch (JsonProcessingException jpe) {
        logger.error(jpe);
    }
}
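
The slide does not show createProducerConfig(). A minimal sketch of what such a method might return is below, assuming string keys and values. The property names and the StringSerializer class name are standard Kafka producer settings; the class name ProducerConfigSketch, the bootstrapServers parameter, and the broker address used in the usage note are assumptions, and the add-on's real configuration may include more settings.

```java
import java.util.Properties;

// Hypothetical sketch; the add-on's actual createProducerConfig() is not shown in the deck.
class ProducerConfigSketch {
    static Properties createProducerConfig(String bootstrapServers) {
        Properties props = new Properties();
        // Comma-separated host:port list of Kafka brokers to connect to
        props.put("bootstrap.servers", bootstrapServers);
        // Messages are sent as JSON strings, so plain string serializers are enough
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }
}
```

For a local broker, createProducerConfig("localhost:9092") would produce a Properties object that can be passed to the KafkaProducer constructor.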

Example listener logs event type and node ref
@KafkaListener(topics = "${kafka.topic}", group = "${kafka.group}",
        containerFactory = "nodeEventKafkaListenerFactory")
public void consumeJson(NodeEvent nodeEvent) {
    try {
        // Skip node types this listener is not interested in
        if (nodeEvent.getContentType().equals("F:cm:systemfolder") ||
                nodeEvent.getContentType().equals("F:bpm:package") ||
                nodeEvent.getContentType().equals("I:act:actionparameter") ||
                nodeEvent.getContentType().equals("I:act:action") ||
                nodeEvent.getContentType().equals("D:cm:thumbnail")) {
            return;
        }
        logger.debug("Event: " + nodeEvent.getEventType() + " on " + nodeEvent.getNodeRef());
    } catch (Exception e) {
        logger.error(e.getMessage());
    }
}
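
As the list of ignored types grows, the chained equals() calls above become hard to maintain. One possible alternative, sketched here with the same five type names, is a set lookup. ContentTypeFilterSketch and isInteresting are hypothetical names, and treating a null content type as uninteresting is an assumption of this sketch (the original relies on the surrounding try/catch in that case).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; the example listener itself uses chained equals() calls.
class ContentTypeFilterSketch {
    // The same content types the example listener skips
    static final Set<String> IGNORED_TYPES = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList(
                    "F:cm:systemfolder",
                    "F:bpm:package",
                    "I:act:actionparameter",
                    "I:act:action",
                    "D:cm:thumbnail")));

    // Null-safe: a missing content type is treated as not interesting (an assumption)
    static boolean isInteresting(String contentType) {
        return contentType != null && !IGNORED_TYPES.contains(contentType);
    }
}
```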

Example: Alfresco reporting
• Customer: “We want to be able to report against metadata in real-time.”
• Solution:
– Spring Boot microservice consumes Alfresco Kafka events
– When a node of interest changes, it fetches the metadata using CMIS
– Indexes metadata into Elasticsearch
– Kibana dashboard used to visualize data
• Demo: https://youtu.be/jGZVfP5L8yU

Indexer Service
• Listens for Alfresco Kafka events
• Fetches the Alfresco Node as JSON
• Indexes the Node JSON into Elasticsearch
• Deletes objects from Elasticsearch when DELETE events occur
[Diagram: Alfresco, running the alfresco-kafka add-on with the Kafka client JAR, publishes events to Apache Kafka; alf-es-indexer consumes them, fetches the Node JSON from Alfresco with a CMIS GET, and sends the Node JSON to the Elasticsearch cluster.]

KafkaConsumer fetches the node, calls indexer
if (nodeEvent.getEventType().equals(NodeEvent.EventType.CREATE) ||
        nodeEvent.getEventType().equals(NodeEvent.EventType.UPDATE) ||
        nodeEvent.getEventType().equals(NodeEvent.EventType.PING)) {
    Node node = alfrescoService.getNode(nodeEvent.getNodeRef());
    // Copy some of the properties from the event onto the node object
    if (nodeEvent.getParent() != null) {
        node.setParent(nodeEvent.getParent());
    }
    if (nodeEvent.getSiteId() != null) {
        node.setSiteId(nodeEvent.getSiteId());
    }
    nodeIndexer.index(node);
} else if (nodeEvent.getEventType().equals(NodeEvent.EventType.DELETE)) {
    nodeRemover.delete(nodeEvent.getNodeRef());
}

Real World Example: Metadata Enrichment with NLP

Example: Natural Language Processing
• Customer: “I want to be able to enrich Alfresco metadata by extracting people, places, and names from content using an NLP model”
• Solution:
– Spring Boot microservice consumes Alfresco Kafka events
– When a node with a “marker” aspect changes, the microservice fetches the content
– Fingerprints are used to avoid repeatedly processing the same content
– Text is extracted using Apache Tika
– Extracted text is run through Apache OpenNLP to extract people and places
– People and places are written to Alfresco content metadata via CMIS
• Demo: https://youtu.be/H-2TgoUijzY

NLP Enricher Service
• Listens for Alfresco Kafka events
• Fetches Alfresco content
• Extracts people, places, and orgs
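• Writes metadata back to Alfresco
[Diagram: Alfresco, running the alfresco-kafka add-on with the Kafka client JAR, publishes events to Apache Kafka; alf-nlp-enricher consumes them, fetches the Node JSON from Alfresco with a CMIS GET, and writes metadata back with a CMIS POST.]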

NodeProcessor uses hash to avoid re-processing file
String hash = null;
try {
    hash = HashSumGenerator.getHash(new FileInputStream(new File(downloadFilePath)));
    logger.debug("Hash: " + hash);
} catch (FileNotFoundException fnfe) {
    logger.error("Download file not found");
}
// If we have seen this exact content before for this node, stop
String pastHash = pastHashesById.get(id);
if (pastHash != null) {
    logger.debug("Past hash: " + pastHash);
    if (pastHash.equals(hash)) {
        logger.debug("Have already processed this exact file for this id, skipping");
        deleteFile(downloadFilePath);
        return;
    }
}
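
The deck does not show how HashSumGenerator computes the fingerprint. A minimal sketch of one way to do it with the JDK alone is below, assuming a SHA-256 digest rendered as lowercase hex; HashSketch and sha256Hex are hypothetical names, and the real service may use a different algorithm.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical fingerprint helper; not the HashSumGenerator used by the service.
class HashSketch {
    // Streams the content through SHA-256 and returns the digest as lowercase hex.
    static String sha256Hex(InputStream in) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        int read;
        try {
            // Read in chunks so large files are never held fully in memory
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Identical content always yields the same string, so comparing the new hash with the one stored for the node id is enough to detect that a file has already been processed.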

Detect sentences, call OpenNLP, update metadata
String[] sentences = sentenceDetector.detect(text);
for (String sentence : sentences) {
    locations = addToSet(locationExtractor.extract(sentence), locations);
    orgs = addToSet(orgExtractor.extract(sentence), orgs);
    names = addToSet(nameExtractor.extract(sentence), names);
}
HashMap<String, Serializable> properties = new HashMap<>();
properties.put(PROP_LOCATIONS, toArrayList(locations));
properties.put(PROP_ORGS, toArrayList(orgs));
properties.put(PROP_NAMES, toArrayList(names));
try {
    alfrescoService.updateNode(id, properties);
} catch (AlfrescoServiceException ase) {
    logger.error(ase.getMessage());
}
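
The addToSet and toArrayList helpers are not shown in the deck. The sketch below is one plausible shape for them, assuming each extractor returns a String[] and that duplicates across sentences should be collapsed; EntitySetSketch is a hypothetical class name, and the trimming and blank-skipping behavior is an assumption rather than the service's confirmed logic.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helpers; the service's real addToSet/toArrayList may differ.
class EntitySetSketch {
    // Adds each non-blank extracted value to the set, creating the set on first use.
    static Set<String> addToSet(String[] found, Set<String> set) {
        Set<String> result = (set != null) ? set : new LinkedHashSet<String>();
        if (found != null) {
            for (String value : found) {
                if (value != null && !value.trim().isEmpty()) {
                    result.add(value.trim());
                }
            }
        }
        return result;
    }

    // ArrayList is Serializable, so the result fits a Map<String, Serializable>.
    static ArrayList<String> toArrayList(Set<String> set) {
        return (set == null) ? new ArrayList<String>() : new ArrayList<String>(set);
    }
}
```

A LinkedHashSet keeps first-seen order, so the multi-valued property written back to Alfresco lists entities in the order they appear in the text.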

Other potential uses
• Full-text search indexing into standalone search engine
• Synchronizing content with other servers
• Improved HTML transformations
• Notification/subscription service
• Chat integration

Event-based approach disadvantages
• More code/complexity than traditional approach
• User feedback/notification is not straightforward
• Potentially increases the number of “containers” in the IT shop

Event-based approach advantages
• In line with Alfresco’s stated architectural direction
• Reduces the amount of code running in Alfresco’s process
– Reduces the number of deployments required to support integrations
– Off-loads long-running and/or process-intensive integrations from Alfresco
– Scales independently of Alfresco
• Integrations are more loosely coupled to Alfresco
– Requires less Alfresco knowledge
– Frees up architectural choices for integrations (not just Java)
• Integration apps are relatively easy to containerize
• Can work alongside traditional approach

Demo Dependency Versions
• Alfresco 5.2.g CE & 5.2.3 Enterprise with
– Metaversant Alfresco Kafka open source add-on 0.0.2
• Apache Kafka 0.10.2.1 (Scala 2.12 build)
• Elasticsearch 6.3.2
• Kibana 6.3.2
• Custom Spring Boot applications
– Spring Boot 1.5.8
– Elasticsearch High Level REST Client 6.3.2
– Tika 1.18
– OpenNLP 1.8.4
– Apache Chemistry 1.0.0

Links
• Apache Kafka: http://kafka.apache.org/
• Apache OpenNLP: http://opennlp.apache.org/
• Apache Tika: https://tika.apache.org/
• Elasticsearch: https://www.elastic.co/products/elasticsearch
• Kibana: https://www.elastic.co/products/kibana
• Spring Boot: https://spring.io/projects/spring-boot

See Also
• Apache ManifoldCF
– http://manifoldcf.apache.org/
– Crawler that indexes from repositories like Alfresco into Solr & Elasticsearch
• Apache Stanbol
– http://stanbol.apache.org/
– Semantic engine that can do metadata enhancement and other things
• Apache Camel
– http://camel.apache.org/
– Enterprise integration platform

• Consulting firm focused on solving business problems with open source
Content Management, Workflow, & Search technology
• Founded in 2010
• Clients all over the world in a variety of industries, including:
– Airlines & Aerospace
– Manufacturing
– Construction
– Financial Services
– Higher Education
– Life Sciences
– Professional Services
https://www.metaversant.com
