Moving From Actions & Behaviors to Microservices

My DevCon 2019 talk discusses how to make it easier to integrate Alfresco with other systems using an event-based approach. Two real-world examples are discussed and demonstrated. The first is about reporting against Alfresco metadata. The second is about enriching metadata by running content through a Natural Language Processing (NLP) model. Both solutions work by listening to generic events generated by Alfresco and placed on an Apache Kafka queue. For the reporting example, a Spring Boot consumer subscribes to Kafka events, then fetches metadata via CMIS and indexes it into Elasticsearch. For the NLP example, a separate Spring Boot consumer subscribes to the same events but fetches the content, extracts text using Apache Tika, runs the text through Apache OpenNLP, then writes the extracted entities back to Alfresco via CMIS. These are relatively simple examples, but they illustrate how a decoupled, asynchronous, event-based approach can make integrating Alfresco with other systems easier.

Published in: Software

  1. Moving from Actions & Behaviors to Microservices, by Jeff Potts, Metaversant (@jeffpotts01)
  2. How do we make it easier to integrate Alfresco with other systems?
  3. Learn. Connect. Collaborate. Recurring customer requirements
      • “We want to be able to report against metadata in real-time.”
      • “When this custom property changes we need to notify this other system.”
      • “We want to improve how Alfresco transforms Word documents into HTML.”
      • “When content changes we want to run it through an NLP model.”
      • “Our company has an enterprise search solution that needs to index Alfresco content.”
      • “We want to replicate content between multiple Alfresco servers.”
  4. Traditional approaches run in-process
      • Custom Alfresco Actions
          – Java, deployed to Alfresco WAR
          – Triggered by rule on a folder, a UI action, or by a schedule
      • Custom Alfresco Behaviors
          – Java, deployed to Alfresco WAR
          – Bound to a policy on a class of nodes (e.g., specific type or aspect)
      • Custom Web Scripts
          – Java or JavaScript, deployed to Alfresco WAR
          – Triggered by a REST call
      • All of these run in Alfresco’s process
  5. Tradeoffs of the traditional approach
      • Advantages
          – Full access to the Alfresco API
          – Runs as the authenticated user or as the system user
          – Code is managed with the content model and other customizations
      • Disadvantages
          – Performance risk
          – Requires server restart to deploy
          – Requires an Alfresco developer familiar with Alfresco API
              • Java & JavaScript are the only practical language options
          – Long-running tasks may block user interface
          – Scales as Alfresco scales
  6. An event-based approach
  7. Event-based integration approach
      • Alfresco can be extended to generate generic events when something happens to a node
      • Interested systems
          – Listen for Alfresco events
          – Filter out what they don’t care about
          – Fetch additional data from Alfresco and perform custom logic as needed
      • Additional systems can be added without touching Alfresco
      • Systems can use different frameworks & languages
      • Independently scalable
      • Can use Alfresco Kafka as a starting point
  8. Move logic out of Alfresco into microservices [Diagram: Alfresco (alfresco-kafka add-on, Kafka Client JAR), Apache Kafka, and three microservices, connected by events]
  9. Example event JSON
      {
        "nodeRef": "3f375925-fa87-4e34-9734-b98bed2d483f",
        "eventType": "CREATE",
        "path": "/{http://www.alfresco.org/model/application/1.0}company_home/…/{http://www.alfresco.org/model/content/1.0}test2.txt",
        "created": 1497282061322,
        "modified": 1497282061322,
        "creator": "admin",
        "modifier": "admin",
        "mimetype": "text/plain",
        "contentType": "content",
        "siteId": "test-site-1",
        "size": 128,
        "parent": "06a154e3-4014-4a55-adfa-5e55040fae2d"
      }
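To make the event payload concrete, here is a minimal sketch of a Java class that mirrors a subset of the JSON fields above. The class name `NodeEventSketch`, the constructor shape, and the `createdAt()` helper are assumptions for illustration; the real `NodeEvent` class in alfresco-kafka may look different. The `EventType` values are the ones that appear elsewhere in the slides, and the sketch assumes `created` is a Unix timestamp in milliseconds.

```java
import java.time.Instant;

// Hypothetical stand-in for the NodeEvent class used in the slides.
final class NodeEventSketch {
    // CREATE appears in the example JSON; the others appear in the indexer slide.
    enum EventType { CREATE, UPDATE, DELETE, PING }

    private final String nodeRef;
    private final EventType eventType;
    private final String contentType;
    private final String mimetype;
    private final String siteId;
    private final long size;
    private final long created; // assumed to be epoch milliseconds

    NodeEventSketch(String nodeRef, EventType eventType, String contentType,
                    String mimetype, String siteId, long size, long created) {
        this.nodeRef = nodeRef;
        this.eventType = eventType;
        this.contentType = contentType;
        this.mimetype = mimetype;
        this.siteId = siteId;
        this.size = size;
        this.created = created;
    }

    String getNodeRef() { return nodeRef; }
    EventType getEventType() { return eventType; }
    String getContentType() { return contentType; }
    String getMimetype() { return mimetype; }
    String getSiteId() { return siteId; }
    long getSize() { return size; }

    // Converts the raw timestamp into a java.time.Instant.
    Instant createdAt() { return Instant.ofEpochMilli(created); }

    public static void main(String[] args) {
        NodeEventSketch event = new NodeEventSketch(
                "3f375925-fa87-4e34-9734-b98bed2d483f", EventType.CREATE, "content",
                "text/plain", "test-site-1", 128L, 1497282061322L);
        System.out.println(event.getEventType() + " on " + event.getNodeRef()
                + " created at " + event.createdAt());
    }
}
```

Because the event carries only identifying metadata, a consumer that needs more (properties, content) would still call back to Alfresco, as the later slides describe.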
  10. Simple Example
  11. Alfresco Kafka Listener Example
      • Alfresco Kafka
          – https://github.com/jpotts/alfresco-kafka
      • Alfresco Kafka Listener Example
          – https://github.com/jpotts/alfresco-kafka-listener-example
      • Demo: https://youtu.be/K40M2gJA7vM
  12. Alfresco Kafka Listener
      • Small Spring Boot app
      • Runs in a servlet container
      • Logs Alfresco Kafka events
      • Example/starter code
      [Diagram: Alfresco (alfresco-kafka, Kafka Client JAR), Apache Kafka, and alfresco-kafka-listener, connected by events]
  13. Demo: Alfresco Kafka Listener
  14. GenerateNodeEvent behavior calls MessageService
      @Override
      public void onCreateNode(ChildAssociationRef childAssocRef) {
          NodeRef nodeRef = childAssocRef.getChildRef();
          if (nodeService.exists(nodeRef)) {
              messageService.publish(nodeRef, NodeEvent.EventType.CREATE);
          }
      }
  15. MessageService sends JSON to the Kafka queue
      public void init() {
          producer = new KafkaProducer<>(createProducerConfig());
      }

      public void publish(NodeRef nodeRef, NodeEvent.EventType eventType) {
          NodeEvent e = nodeTransformer.transform(nodeRef);
          e.setEventType(eventType);
          publish(e);
      }

      private void publish(NodeEvent event) {
          try {
              final String message = mapper.writeValueAsString(event);
              if (message != null && message.length() != 0) {
                  producer.send(new ProducerRecord<String, String>(topic, message));
              }
          } catch (JsonProcessingException jpe) {
              logger.error(jpe);
          }
      }
  16. Example listener logs event type and node ref
      @KafkaListener(topics="${kafka.topic}", group = "${kafka.group}", containerFactory = "nodeEventKafkaListenerFactory")
      public void consumeJson(NodeEvent nodeEvent) {
          try {
              if (nodeEvent.getContentType().equals("F:cm:systemfolder") ||
                      nodeEvent.getContentType().equals("F:bpm:package") ||
                      nodeEvent.getContentType().equals("I:act:actionparameter") ||
                      nodeEvent.getContentType().equals("I:act:action") ||
                      nodeEvent.getContentType().equals("D:cm:thumbnail")) {
                  return;
              }
              logger.debug("Event: " + nodeEvent.getEventType() + " on " + nodeEvent.getNodeRef());
          } catch (Exception e) {
              logger.error(e.getMessage());
          }
      }
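The chain of `equals` calls in the listener above could also be expressed as a set lookup. The following is a sketch of that alternative, not the code from the talk: the class name `ContentTypeFilter` and the method `shouldProcess` are hypothetical, and treating a missing content type as "skip" is an assumption (the original would throw inside the `try` and log the error instead).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: decides whether a listener should handle an event.
final class ContentTypeFilter {
    // The same content types the slide's listener ignores.
    private static final Set<String> IGNORED_TYPES = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(
                    "F:cm:systemfolder",
                    "F:bpm:package",
                    "I:act:actionparameter",
                    "I:act:action",
                    "D:cm:thumbnail")));

    // Returns false for ignored types and (by assumption) for a null type.
    static boolean shouldProcess(String contentType) {
        return contentType != null && !IGNORED_TYPES.contains(contentType);
    }

    public static void main(String[] args) {
        System.out.println("content: " + shouldProcess("content"));
        System.out.println("D:cm:thumbnail: " + shouldProcess("D:cm:thumbnail"));
    }
}
```

Keeping the ignore list in one collection makes it easier to move into configuration later, so each microservice can filter out what it does not care about without a code change.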
  17. Real World Example: Reporting
  18. Example: Alfresco reporting
      • Customer: “We want to be able to report against metadata in real-time.”
      • Solution:
          – Spring Boot microservice consumes Alfresco Kafka events
          – When a node changes that is interesting, it fetches the metadata using CMIS
          – Indexes metadata into Elasticsearch
          – Kibana dashboard used to visualize data
      • Demo: https://youtu.be/jGZVfP5L8yU
  19. Indexer Service
      • Small Spring Boot app
      • Runs in a servlet container
      • Listens for Alfresco Kafka events
      • Fetches the Alfresco Node as JSON
      • Indexes the Node JSON into Elasticsearch
      • Deletes objects from Elasticsearch when DELETE events occur
      [Diagram: Alfresco (alfresco-kafka, Kafka Client JAR), Apache Kafka, alf-es-indexer, and an Elasticsearch cluster; labels: Event, CMIS GET, Node JSON]
  20. Demo: alf-es-indexer
  21. KafkaConsumer fetches the node, calls indexer
      if (nodeEvent.getEventType().equals(NodeEvent.EventType.CREATE) ||
              nodeEvent.getEventType().equals(NodeEvent.EventType.UPDATE) ||
              nodeEvent.getEventType().equals(NodeEvent.EventType.PING)) {
          Node node = alfrescoService.getNode(nodeEvent.getNodeRef());
          // Copy some of the properties from the event onto the node object
          if (nodeEvent.getParent() != null) {
              node.setParent(nodeEvent.getParent());
          }
          if (nodeEvent.getSiteId() != null) {
              node.setSiteId(nodeEvent.getSiteId());
          }
          nodeIndexer.index(node);
      } else if (nodeEvent.getEventType().equals(NodeEvent.EventType.DELETE)) {
          nodeRemover.delete(nodeEvent.getNodeRef());
      }
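The routing logic on this slide (CREATE, UPDATE, and PING lead to indexing; DELETE leads to removal) can be isolated from Kafka, CMIS, and Elasticsearch and exercised on its own. The sketch below is an illustration under assumptions: `EventDispatcher`, `Indexer`, and `Remover` are hypothetical names, and the node is represented only by its node ref string rather than a fetched `Node` object.

```java
// Hypothetical dispatcher that mirrors the slide's if/else routing with a switch.
final class EventDispatcher {
    enum EventType { CREATE, UPDATE, DELETE, PING }

    // Stand-ins for the slide's nodeIndexer and nodeRemover collaborators.
    interface Indexer { void index(String nodeRef); }
    interface Remover { void delete(String nodeRef); }

    private final Indexer indexer;
    private final Remover remover;

    EventDispatcher(Indexer indexer, Remover remover) {
        this.indexer = indexer;
        this.remover = remover;
    }

    void dispatch(EventType type, String nodeRef) {
        if (type == null) {
            return; // nothing to route
        }
        switch (type) {
            case CREATE:
            case UPDATE:
            case PING:
                // In the real service the node would first be fetched from Alfresco.
                indexer.index(nodeRef);
                break;
            case DELETE:
                remover.delete(nodeRef);
                break;
            default:
                break;
        }
    }

    public static void main(String[] args) {
        EventDispatcher dispatcher = new EventDispatcher(
                nodeRef -> System.out.println("index " + nodeRef),
                nodeRef -> System.out.println("delete " + nodeRef));
        dispatcher.dispatch(EventType.CREATE, "node-1");
        dispatcher.dispatch(EventType.DELETE, "node-1");
    }
}
```

Passing the indexer and remover in as interfaces keeps the routing testable without a running Elasticsearch cluster.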
  23. Real World Example: Metadata Enrichment with NLP
  24. Example: Natural Language Processing
      • Customer: “I want to be able to enrich Alfresco metadata by extracting people, places, and names from content using an NLP model”
      • Solution:
          – Spring Boot microservice consumes Alfresco Kafka events
          – When a node with a “marker” aspect changes, the microservice fetches the content
          – Fingerprints are used to avoid repeatedly processing the same content
          – Text is extracted using Apache Tika
          – Extracted text is run through Apache OpenNLP to extract people and places
          – People and places are written to Alfresco content metadata via CMIS
      • Demo: https://youtu.be/H-2TgoUijzY
  25. NLP Enricher Service
      • Small Spring Boot app
      • Runs in a servlet container
      • Listens for Alfresco Kafka events
      • Fetches Alfresco content
      • Extracts people, places, and orgs
      • Writes metadata back to Alfresco
      [Diagram: Alfresco (alfresco-kafka, Kafka Client JAR), Apache Kafka, and alf-nlp-enricher; labels: Event, CMIS GET, Node JSON, CMIS POST]
  26. Demo: alf-nlp-enricher
  27. NodeProcessor uses hash to avoid re-processing file
      String hash = null;
      try {
          hash = HashSumGenerator.getHash(new FileInputStream(new File(downloadFilePath)));
          logger.debug("Hash: " + hash);
      } catch (FileNotFoundException fnfe) {
          logger.error("Download file not found");
      }

      // If we have seen this exact content before for this node, stop
      String pastHash = pastHashesById.get(id);
      if (pastHash != null) {
          logger.debug("Past hash: " + pastHash);
          if (pastHash.equals(hash)) {
              logger.debug("Have already processed this exact file for this id, skipping");
              deleteFile(downloadFilePath);
              return;
          }
      }
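The fingerprint check above can be sketched with the JDK alone. This is an illustration under assumptions, not the talk's `HashSumGenerator`: the slides do not say which hash algorithm is used, so SHA-256 is chosen here, and `FingerprintTracker` and `markIfChanged` are hypothetical names. Unlike the slide, which only shows the lookup, this sketch also records the new hash in the same step.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker: remembers the last content hash seen for each node id.
final class FingerprintTracker {
    private final Map<String, String> pastHashesById = new ConcurrentHashMap<>();

    // Hex-encoded SHA-256 of the content (algorithm choice is an assumption).
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Returns true when the content differs from what was last seen for this id,
    // and records the new hash either way.
    boolean markIfChanged(String id, byte[] content) {
        String hash = sha256Hex(content);
        String previous = pastHashesById.put(id, hash);
        return !hash.equals(previous);
    }

    public static void main(String[] args) {
        FingerprintTracker tracker = new FingerprintTracker();
        byte[] content = "Jeff visited Dallas.".getBytes(StandardCharsets.UTF_8);
        System.out.println("first time: " + tracker.markIfChanged("node-1", content));
        System.out.println("second time: " + tracker.markIfChanged("node-1", content));
    }
}
```

A check like this matters in the enrichment scenario because writing extracted entities back to Alfresco itself generates an update event for the same node; without it the service could keep re-processing unchanged content. The in-memory map is lost on restart, so a production service would likely need a persistent store.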
  28. Detect sentences, call OpenNLP, update metadata
      String sentences[] = sentenceDetector.detect(text);
      for (String sentence : sentences) {
          locations = addToSet(locationExtractor.extract(sentence), locations);
          orgs = addToSet(orgExtractor.extract(sentence), orgs);
          names = addToSet(nameExtractor.extract(sentence), names);
      }

      HashMap<String, Serializable> properties = new HashMap<>();
      properties.put(PROP_LOCATIONS, toArrayList(locations));
      properties.put(PROP_ORGS, toArrayList(orgs));
      properties.put(PROP_NAMES, toArrayList(names));

      try {
          alfrescoService.updateNode(id, properties);
      } catch (AlfrescoServiceException ase) {
          logger.error(ase.getMessage());
      }
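The slide calls two helpers, `addToSet` and `toArrayList`, without showing them. One plausible shape for them is sketched below; the signatures and the choice of `LinkedHashSet` are guesses made for illustration, and the real helpers in alf-nlp-enricher may differ (for example, the extractors may not return `String[]`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical versions of the helpers referenced on the slide.
final class EntityAccumulator {
    // Adds newly found entities to the running set, creating the set on first use.
    // A set removes duplicates when the same entity appears in several sentences.
    static Set<String> addToSet(String[] found, Set<String> target) {
        Set<String> result = (target == null) ? new LinkedHashSet<String>() : target;
        if (found != null) {
            result.addAll(Arrays.asList(found));
        }
        return result;
    }

    // ArrayList is Serializable, so the result fits a Map<String, Serializable>.
    static ArrayList<String> toArrayList(Set<String> values) {
        return (values == null) ? new ArrayList<String>() : new ArrayList<String>(values);
    }

    public static void main(String[] args) {
        Set<String> locations = null;
        locations = addToSet(new String[] {"Dallas", "Austin"}, locations);
        locations = addToSet(new String[] {"Austin", "Paris"}, locations);
        System.out.println(toArrayList(locations));
    }
}
```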
  30. Considerations
  31. Move logic out of Alfresco into microservices [Diagram: Alfresco (alfresco-kafka add-on, Kafka Client JAR), Apache Kafka, and three microservices, connected by events]
  32. Other potential uses
      • Full-text search indexing into standalone search engine
      • Synchronizing content with other servers
      • Improved HTML transformations
      • Notification/subscription service
      • Chat integration
  33. Event-based approach disadvantages
      • More code/complexity than traditional approach
      • User feedback/notification is not straightforward
      • Potentially increases the number of “containers” in the IT shop
  34. Event-based approach advantages
      • In line with Alfresco’s stated architectural direction
      • Reduces the amount of code running in Alfresco’s process
          – Reduces the number of deployments required to support integrations
          – Off-loads long-running and/or process-intensive integrations from Alfresco
          – Scales independently of Alfresco
      • Integrations are more loosely coupled to Alfresco
          – Requires less Alfresco knowledge
          – Frees up architectural choices for integrations (not just Java)
      • Integration apps are relatively easy to containerize
      • Can work alongside traditional approach
  35. Demo Dependency Versions
      • Alfresco 5.2.g CE & 5.2.3 Enterprise with
          – Metaversant Alfresco Kafka open source add-on 0.0.2
      • Apache Kafka 2.12-0.10.2.1
      • Elasticsearch 6.3.2
      • Kibana 6.3.2
      • Custom Spring Boot applications
          – Spring Boot 1.5.8
          – Elasticsearch High-level Rest Client 6.3.2
          – Tika 1.18
          – OpenNLP 1.8.4
          – Apache Chemistry 1.0.0
  36. Links
      • Apache Kafka: http://kafka.apache.org/
      • Apache OpenNLP: http://opennlp.apache.org/
      • Apache Tika: https://tika.apache.org/
      • Elasticsearch: https://www.elastic.co/products/elasticsearch
      • Kibana: https://www.elastic.co/products/kibana
      • Spring Boot: https://spring.io/projects/spring-boot
  37. See Also
      • Apache ManifoldCF
          – http://manifoldcf.apache.org/
          – Crawler that indexes from repositories like Alfresco into Solr & Elasticsearch
      • Apache Stanbol
          – http://stanbol.apache.org/
          – Semantic engine that can do metadata enhancement and other things
      • Apache Camel
          – http://camel.apache.org/
          – Enterprise integration platform
  38. Metaversant (https://www.metaversant.com)
      • Consulting firm focused on solving business problems with open source Content Management, Workflow, & Search technology
      • Founded in 2010
      • Clients all over the world in a variety of industries, including:
          – Airlines & Aerospace
          – Manufacturing
          – Construction
          – Financial Services
          – Higher Education
          – Life Sciences
          – Professional Services
  39. Moving from Actions & Behaviors to Microservices, by Jeff Potts, Metaversant (@jeffpotts01)
