SlideShare a Scribd company logo
1 of 24
Download to read offline
Apache ManifoldCF
Overview

● The story
● What is ManifoldCF?
● Why ManifoldCF?
● Architecture
● The 0.3-incubating version
● The 0.4-incubating version
● What's new in the 0.5-incubating
● The book: ManifoldCF in Action
● Demo
● Resources
The story

The original ManifoldCF code base was granted by MetaCarta Inc.,
to the Apache Software Foundation in December 2009.

The MetaCarta effort represented more than five years of successful
development and testing in multiple, challenging enterprise
environments.

The project is in the Apache Incubator because the community was
not yet diverse enough, but now the project is towards graduation.
                                 ^__^
What is ManifoldCF?
● Open Source crawler
   ○ schedule jobs to create indexes
      ■ get contents from repositories
      ■ push contents on search servers
What is ManifoldCF?
● Open Source crawler
   ○ schedule jobs to create indexes
      ■ get contents from repositories
      ■ push contents on search servers

● Out-Of-The-Box it is distributed as J2EE web apps
   ○ REST API
   ○ Authority Service
   ○ Crawler UI

● Can be embedded in any Java application
Why ManifoldCF?
● Reliability
● Incremental
● Multi repositories
● Security model
● Monitoring
Why ManifoldCF? - Reliability

Jobs scheduling and configuration are stored in the database
to maintain the state of all the executions
Why ManifoldCF? - Incremental

Jobs can be optionally configured to re-visit contents
incrementally
Why ManifoldCF? - Multi repositories

Jobs can retrieve contents from the following repositories:
 ● CMIS-compliant
 ● Alfresco
 ● IBM FileNet
 ● EMC Documentum
 ● Microsoft SharePoint
 ● OpenText LiveLink
 ● Autonomy Meridio
 ● Memex Patriarch
 ● Windows Share/DFS
 ● Generic JDBC
 ● Generic Filesystem
 ● Generic RSS and Web
Why ManifoldCF? - Multi repositories

Jobs can ingest contents to the following search servers:
 ● ElasticSearch
 ● OpenSearchServer
 ● Apache Solr
 ● MetaCarta GTS
Why ManifoldCF? - Security model

Retrieve per-content ACLs
Why ManifoldCF? - Monitoring

UI Crawler allows you to:
 ● configure jobs and connectors
 ● monitor jobs execution
 ● monitor contents ingestion
    ○ status reports
        ■ document status
        ■ queue status
    ○ history reports
        ■ simple history
        ■ maximum activity
        ■ maximum bandwidth
        ■ result histogram
Architecture

● Pull Agent Daemon
   ○ Jobs
       ■ Repository Connectors
       ■ Output Connectors
       ■ Authority Connectors
Architecture

● Pull Agent Daemon (the core service)
   ○ Jobs (execute the ingestion tasks)
       ■ Repository Connectors (retrieve contents)
       ■ Output Connectors (ingest contents)
       ■ Authority Connectors (retrieve ACLs)
Architecture
Architecture - Job

A job is an ingestion work that consists of:
     ○ verbal description
     ○ repository connection
         ■ authority connection (optional)
     ○ metadata mapping
     ○ output connection (search server)
     ○ crawling model
     ○ scheduling information (on demand or time ranges)
Architecture - Job
The 0.3-incubating version

● CMIS Repository Connector
● OpenSearchServer Output Connector
● Scripting Language
● New Maven build process
● Several bug fixes
The 0.4-incubating version

● Alfresco Connector
● JDBC Connector now supports MySQL
● CMIS Connector upgraded to OpenCMIS 0.5.0
● Several bug fixes
What's new in the 0.5-incubating

● Apache Velocity for connectors UI templates
● ElasticSearch Output Connector
● CMIS Connector upgraded to OpenCMIS 0.6.0
● Prebuild connector support: just add jars and go!
● New Japanese localization
● Several bug fixes
The book: ManifoldCF in Action

ManifoldCF in Action
by Karl Wright
published by Manning


Karl is the original developer and the
principal committer of Apache ManifoldCF


The book is available at the following site:
http://www.manning.com/wright
DEMO
Resources


Homepage:
http://incubator.apache.org/connectors



Download page:
http://incubator.apache.org/connectors/download.html
Thank you for your attention!

More Related Content

What's hot

Spring Framework
Spring Framework  Spring Framework
Spring Framework tola99
 
Web services - A Practical Approach
Web services - A Practical ApproachWeb services - A Practical Approach
Web services - A Practical ApproachMadhaiyan Muthu
 
Pre-launch Checklist for Going Production on AWS
Pre-launch Checklist for Going Production on AWS Pre-launch Checklist for Going Production on AWS
Pre-launch Checklist for Going Production on AWS Amazon Web Services
 
WebLogic Scripting Tool Overview
WebLogic Scripting Tool OverviewWebLogic Scripting Tool Overview
WebLogic Scripting Tool OverviewJames Bayer
 
Enterprise java unit-2_chapter-1
Enterprise  java unit-2_chapter-1Enterprise  java unit-2_chapter-1
Enterprise java unit-2_chapter-1sandeep54552
 
Active mq Installation and Master Slave setup
Active mq Installation and Master Slave setupActive mq Installation and Master Slave setup
Active mq Installation and Master Slave setupRamakrishna Narkedamilli
 
Using AWS Key Management Service for Secure Workloads
Using AWS Key Management Service for Secure WorkloadsUsing AWS Key Management Service for Secure Workloads
Using AWS Key Management Service for Secure WorkloadsAmazon Web Services
 
Spring Framework - AOP
Spring Framework - AOPSpring Framework - AOP
Spring Framework - AOPDzmitry Naskou
 
Serverless Microservices Communication with Amazon EventBridge
Serverless Microservices Communication with Amazon EventBridgeServerless Microservices Communication with Amazon EventBridge
Serverless Microservices Communication with Amazon EventBridgeSheenBrisals
 

What's hot (20)

Spring Framework
Spring Framework  Spring Framework
Spring Framework
 
Web services - A Practical Approach
Web services - A Practical ApproachWeb services - A Practical Approach
Web services - A Practical Approach
 
Getting Started with Amazon EC2
Getting Started with Amazon EC2Getting Started with Amazon EC2
Getting Started with Amazon EC2
 
Ppt of blogs
Ppt of blogsPpt of blogs
Ppt of blogs
 
Pre-launch Checklist for Going Production on AWS
Pre-launch Checklist for Going Production on AWS Pre-launch Checklist for Going Production on AWS
Pre-launch Checklist for Going Production on AWS
 
WebLogic Scripting Tool Overview
WebLogic Scripting Tool OverviewWebLogic Scripting Tool Overview
WebLogic Scripting Tool Overview
 
AWS EC2 and ELB troubleshooting
AWS EC2 and ELB troubleshootingAWS EC2 and ELB troubleshooting
AWS EC2 and ELB troubleshooting
 
Intro to AWS Lambda
Intro to AWS Lambda Intro to AWS Lambda
Intro to AWS Lambda
 
Enterprise java unit-2_chapter-1
Enterprise  java unit-2_chapter-1Enterprise  java unit-2_chapter-1
Enterprise java unit-2_chapter-1
 
Active mq Installation and Master Slave setup
Active mq Installation and Master Slave setupActive mq Installation and Master Slave setup
Active mq Installation and Master Slave setup
 
Amazon services ec2
Amazon services ec2Amazon services ec2
Amazon services ec2
 
Using AWS Key Management Service for Secure Workloads
Using AWS Key Management Service for Secure WorkloadsUsing AWS Key Management Service for Secure Workloads
Using AWS Key Management Service for Secure Workloads
 
An Introduction To REST API
An Introduction To REST APIAn Introduction To REST API
An Introduction To REST API
 
How To Select A Training Vendor Scorecard
How To Select A Training Vendor ScorecardHow To Select A Training Vendor Scorecard
How To Select A Training Vendor Scorecard
 
Mule SAP connector
Mule SAP connectorMule SAP connector
Mule SAP connector
 
Application Migrations
Application MigrationsApplication Migrations
Application Migrations
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
 
Spring Framework - AOP
Spring Framework - AOPSpring Framework - AOP
Spring Framework - AOP
 
Serverless Microservices Communication with Amazon EventBridge
Serverless Microservices Communication with Amazon EventBridgeServerless Microservices Communication with Amazon EventBridge
Serverless Microservices Communication with Amazon EventBridge
 
AWS Simple Storage Service (s3)
AWS Simple Storage Service (s3) AWS Simple Storage Service (s3)
AWS Simple Storage Service (s3)
 

Viewers also liked

Integrate ManifoldCF with Solr
Integrate ManifoldCF with SolrIntegrate ManifoldCF with Solr
Integrate ManifoldCF with Solrfrancelabs
 
Integrating Alfresco with Portals
Integrating Alfresco with PortalsIntegrating Alfresco with Portals
Integrating Alfresco with PortalsPiergiorgio Lucidi
 
A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...lucenerevolution
 
Web scraping with nutch solr
Web scraping with nutch solrWeb scraping with nutch solr
Web scraping with nutch solrMike Frampton
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-WebinarEdureka!
 

Viewers also liked (7)

Apache ManifoldCF
Apache ManifoldCFApache ManifoldCF
Apache ManifoldCF
 
Integrate ManifoldCF with Solr
Integrate ManifoldCF with SolrIntegrate ManifoldCF with Solr
Integrate ManifoldCF with Solr
 
Super Size Your Search
Super Size Your SearchSuper Size Your Search
Super Size Your Search
 
Integrating Alfresco with Portals
Integrating Alfresco with PortalsIntegrating Alfresco with Portals
Integrating Alfresco with Portals
 
A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...
 
Web scraping with nutch solr
Web scraping with nutch solrWeb scraping with nutch solr
Web scraping with nutch solr
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-Webinar
 

Similar to Apache ManifoldCF

Alfresco WebScript Connector for Apache ManifoldCF
Alfresco WebScript Connector for Apache ManifoldCFAlfresco WebScript Connector for Apache ManifoldCF
Alfresco WebScript Connector for Apache ManifoldCFPiergiorgio Lucidi
 
Apache ManifoldCF @ Linux Day 2012
Apache ManifoldCF @ Linux Day 2012Apache ManifoldCF @ Linux Day 2012
Apache ManifoldCF @ Linux Day 2012Piergiorgio Lucidi
 
Openshift service broker and catalog ocp-meetup july 2018
Openshift service broker and catalog  ocp-meetup july 2018Openshift service broker and catalog  ocp-meetup july 2018
Openshift service broker and catalog ocp-meetup july 2018Michael Calizo
 
2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_MicroservicesJason Varghese
 
DevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsDevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsFedir RYKHTIK
 
Experiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseExperiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseMarcelo Ochoa
 
Enterprise Integration Patterns with Apache Camel
Enterprise Integration Patterns with Apache CamelEnterprise Integration Patterns with Apache Camel
Enterprise Integration Patterns with Apache CamelIoan Eugen Stan
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPRCamille Salas
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroPyData
 
AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)Igor Talevski
 
Developing Microservices using Spring - Beginner's Guide
Developing Microservices using Spring - Beginner's GuideDeveloping Microservices using Spring - Beginner's Guide
Developing Microservices using Spring - Beginner's GuideMohanraj Thirumoorthy
 
Eclipse Apricot
Eclipse ApricotEclipse Apricot
Eclipse ApricotNuxeo
 
Kotlin REST & GraphQL API
Kotlin REST & GraphQL APIKotlin REST & GraphQL API
Kotlin REST & GraphQL APISean O'Brien
 
Introducing Apricot, The Eclipse Content Management Platform
Introducing Apricot, The Eclipse Content Management PlatformIntroducing Apricot, The Eclipse Content Management Platform
Introducing Apricot, The Eclipse Content Management PlatformNuxeo
 
API workshop by AWS and 3scale
API workshop by AWS and 3scaleAPI workshop by AWS and 3scale
API workshop by AWS and 3scale3scale
 
Melbourne User Group OAK and MongoDB
Melbourne User Group OAK and MongoDBMelbourne User Group OAK and MongoDB
Melbourne User Group OAK and MongoDBYuval Ararat
 
Modern application development with oracle cloud sangam17
Modern application development with oracle cloud sangam17Modern application development with oracle cloud sangam17
Modern application development with oracle cloud sangam17Vinay Kumar
 
Design Summit - Technology Vision - Oleg Barenboim and Jason Frey
Design Summit - Technology Vision - Oleg Barenboim and Jason FreyDesign Summit - Technology Vision - Oleg Barenboim and Jason Frey
Design Summit - Technology Vision - Oleg Barenboim and Jason FreyManageIQ
 

Similar to Apache ManifoldCF (20)

Alfresco WebScript Connector for Apache ManifoldCF
Alfresco WebScript Connector for Apache ManifoldCFAlfresco WebScript Connector for Apache ManifoldCF
Alfresco WebScript Connector for Apache ManifoldCF
 
Apache ManifoldCF @ Linux Day 2012
Apache ManifoldCF @ Linux Day 2012Apache ManifoldCF @ Linux Day 2012
Apache ManifoldCF @ Linux Day 2012
 
Openshift service broker and catalog ocp-meetup july 2018
Openshift service broker and catalog  ocp-meetup july 2018Openshift service broker and catalog  ocp-meetup july 2018
Openshift service broker and catalog ocp-meetup july 2018
 
2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices
 
DevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsDevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and Projects
 
Experiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseExperiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the Database
 
Red Hat Storage Roadmap
Red Hat Storage RoadmapRed Hat Storage Roadmap
Red Hat Storage Roadmap
 
Red Hat Storage Roadmap
Red Hat Storage RoadmapRed Hat Storage Roadmap
Red Hat Storage Roadmap
 
Enterprise Integration Patterns with Apache Camel
Enterprise Integration Patterns with Apache CamelEnterprise Integration Patterns with Apache Camel
Enterprise Integration Patterns with Apache Camel
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
 
AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)
 
Developing Microservices using Spring - Beginner's Guide
Developing Microservices using Spring - Beginner's GuideDeveloping Microservices using Spring - Beginner's Guide
Developing Microservices using Spring - Beginner's Guide
 
Eclipse Apricot
Eclipse ApricotEclipse Apricot
Eclipse Apricot
 
Kotlin REST & GraphQL API
Kotlin REST & GraphQL APIKotlin REST & GraphQL API
Kotlin REST & GraphQL API
 
Introducing Apricot, The Eclipse Content Management Platform
Introducing Apricot, The Eclipse Content Management PlatformIntroducing Apricot, The Eclipse Content Management Platform
Introducing Apricot, The Eclipse Content Management Platform
 
API workshop by AWS and 3scale
API workshop by AWS and 3scaleAPI workshop by AWS and 3scale
API workshop by AWS and 3scale
 
Melbourne User Group OAK and MongoDB
Melbourne User Group OAK and MongoDBMelbourne User Group OAK and MongoDB
Melbourne User Group OAK and MongoDB
 
Modern application development with oracle cloud sangam17
Modern application development with oracle cloud sangam17Modern application development with oracle cloud sangam17
Modern application development with oracle cloud sangam17
 
Design Summit - Technology Vision - Oleg Barenboim and Jason Frey
Design Summit - Technology Vision - Oleg Barenboim and Jason FreyDesign Summit - Technology Vision - Oleg Barenboim and Jason Frey
Design Summit - Technology Vision - Oleg Barenboim and Jason Frey
 

More from Piergiorgio Lucidi

Embracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital TransformationEmbracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital TransformationPiergiorgio Lucidi
 
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community Piergiorgio Lucidi
 
Smart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success StorySmart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success StoryPiergiorgio Lucidi
 
Design your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process ServicesDesign your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process ServicesPiergiorgio Lucidi
 
Smart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFSmart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFPiergiorgio Lucidi
 
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 ItalyAlfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 ItalyPiergiorgio Lucidi
 
The Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's SuccessesThe Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's SuccessesPiergiorgio Lucidi
 
Implementing portlets using Web Scripts
Implementing portlets using Web ScriptsImplementing portlets using Web Scripts
Implementing portlets using Web ScriptsPiergiorgio Lucidi
 
Alfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - SourcesenseAlfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - SourcesensePiergiorgio Lucidi
 
Alfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European BankAlfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European BankPiergiorgio Lucidi
 
The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomeThe ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomePiergiorgio Lucidi
 

More from Piergiorgio Lucidi (14)

Embracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital TransformationEmbracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital Transformation
 
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
 
Smart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success StorySmart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success Story
 
Design your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process ServicesDesign your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process Services
 
Smart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFSmart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCF
 
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 ItalyAlfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
 
The Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's SuccessesThe Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's Successes
 
Implementing portlets using Web Scripts
Implementing portlets using Web ScriptsImplementing portlets using Web Scripts
Implementing portlets using Web Scripts
 
Alfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - SourcesenseAlfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - Sourcesense
 
Alfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European BankAlfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European Bank
 
The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomeThe ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
 
Hippo CMS - A first look
Hippo CMS - A first lookHippo CMS - A first look
Hippo CMS - A first look
 
Spring Ldap
Spring LdapSpring Ldap
Spring Ldap
 
Spring In Alfresco Ecm
Spring In Alfresco EcmSpring In Alfresco Ecm
Spring In Alfresco Ecm
 

Recently uploaded

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Recently uploaded (20)

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

Apache ManifoldCF

  • 2. Overview ● The story ● What is ManifoldCF? ● Why ManifoldCF? ● Architecture ● The 0.3-incubating version ● The 0.4-incubating version ● What's new in the 0.5-incubating ● The book: ManifoldCF in Action ● Demo ● Resources
  • 3. The story The original ManifoldCF code base was granted by MetaCarta Inc., to the Apache Software Foundation in December 2009. The MetaCarta effort represented more than five years of successful development and testing in multiple, challenging enterprise environments. The project is in the Apache Incubator because the community was not yet diverse enough, but now the project is towards graduation. ^__^
  • 4. What is ManifoldCF? ● Open Source crawler ○ schedule jobs to create indexes ■ get contents from repositories ■ push contents on search servers
  • 5. What is ManifoldCF? ● Open Source crawler ○ schedule jobs to create indexes ■ get contents from repositories ■ push contents on search servers ● Out-Of-The-Box it is distributed as J2EE web apps ○ REST API ○ Authority Service ○ Crawler UI ● Can be embedded in any Java application
  • 6. Why ManifoldCF? ● Reliability ● Incremental ● Multi repositories ● Security model ● Monitoring
  • 7. Why ManifoldCF? - Reliability Jobs scheduling and configuration are stored in the database to maintain the state of all the executions
  • 8. Why ManifoldCF? - Incremental Jobs can be optionally configured to re-visit contents incrementally
  • 9. Why ManifoldCF? - Multi repositories Jobs can retrieve contents from the following repositories: ● CMIS-compliant ● Alfresco ● IBM FileNet ● EMC Documentum ● Microsoft SharePoint ● OpenText LiveLink ● Autonomy Meridio ● Memex Patriarch ● Windows Share/DFS ● Generic JDBC ● Generic Filesystem ● Generic RSS and Web
  • 10. Why ManifoldCF? - Multi repositories Jobs can ingest contents to the following search servers: ● ElasticSearch ● OpenSearchServer ● Apache Solr ● MetaCarta GTS
  • 11. Why ManifoldCF? - Security model Retrieve per-content ACLs
  • 12. Why ManifoldCF? - Monitoring UI Crawler allows you to: ● configure jobs and connectors ● monitor jobs execution ● monitor contents ingestion ○ status reports ■ document status ■ queue status ○ history reports ■ simple history ■ maximum activity ■ maximum bandwidth ■ result histogram
  • 13. Architecture ● Pull Agent Daemon ○ Jobs ■ Repository Connectors ■ Output Connectors ■ Authority Connectors
  • 14. Architecture ● Pull Agent Daemon (the core service) ○ Jobs (execute the ingestion tasks) ■ Repository Connectors (retrieve contents) ■ Output Connectors (ingest contents) ■ Authority Connectors (retrieve ACLs)
  • 16. Architecture - Job A job is an ingestion work that consists of: ○ verbal description ○ repository connection ■ authority connection (optional) ○ metadata mapping ○ output connection (search server) ○ crawling model ○ scheduling information (on demand or time ranges)
  • 18. The 0.3-incubating version ● CMIS Repository Connector ● OpenSearchServer Output Connector ● Scripting Language ● New Maven build process ● Several bug fixes
  • 19. The 0.4-incubating version ● Alfresco Connector ● JDBC Connector now supports MySQL ● CMIS Connector upgraded to OpenCMIS 0.5.0 ● Several bug fixes
  • 20. What's new in the 0.5-incubating ● Apache Velocity for connectors UI templates ● ElasticSearch Output Connector ● CMIS Connector upgraded to OpenCMIS 0.6.0 ● Prebuild connector support: just add jars and go! ● New Japanese localization ● Several bug fixes
  • 21. The book: ManifoldCF in Action ManifoldCF in Action by Karl Wright published by Manning Karl is the original developer and the principal committer of Apache ManifoldCF The book is available at the following site: http://www.manning.com/wright
  • 22. DEMO
  • 24. Thank you for your attention!