3. Who we are
• Company started in 2000
• 2002-2005: Zope-based CPS project
• 2005: First Eclipse RCP-based project
• 2006-now: Full switch to Java (Java EE 5 and OSGi)
• 2009-now: Business model migration from service company to an OSS software vendor
4. What is ECM?
ECM, a concept that emerged in the early
2000s, represents the integrated
enterprise-wide management of all
forms of non-structured (and
sometimes, semi-structured) content,
including their metadata, across their
whole lifecycle, supported by appropriate
technologies and administrative
infrastructure.
6. What are CEVAs?
• 4LA (four-letter acronym) coined by Gartner in 2006: “Content Enabled Vertical Applications”
• “CEVAs typically help to automate complex processes that previously required workers to manually sort through paper documents and other forms of content (in effect, a way to manage down costs of exception handling) and optimize the remainder of the work.”
7. Business Goals
• First, create an MVP (minimum viable product) to ensure company sustainability
• Base it on a clean, extensible architecture
• With the end goal of enabling the creation of a rich ecosystem of extensions and application profiles
8. Nuxeo CPS
• Content management and portal platform
• Developed from 2002 to 2005
• Built on top of the Zope and CMF (Content Management Framework) open source frameworks
• Architecture: pluggable components (“Products”) and events
9. Switch to Java: Why?
• Technical reasons:
  • ZODB doesn’t scale well in terms of data volume
  • Dynamic languages don’t scale well in terms of managing complexity (> 100 KLOC)
• Business reasons:
  • Java makes it much easier to work with mainstream systems integrators
10. A Few Numbers
• Nuxeo EP+DM is a 400 KLOC Java project
• Comprises ~190 independent modules (JARs)
• Developed over the last 4 1/2 years by a core team of 20 developers and 50 community contributors
• Has generated ~20 MEUR of revenue for Nuxeo, ~50 MEUR for partners
12. Business Vision
• Address the full ECM scope
• Initial focus on Document Management
• Architecture must be extensible and modular
• Enable and sustain the Ecosystem
• Easy to work with, designed for participation
13. Business Vision
• Low barrier to entry for:
  • End-users (e.g. pleasant UI)
  • Developers (e.g. clean model and API, leverage existing knowledge)
  • Sysadmins / operations
• “Enterprise-class” software
  • Tens of thousands of users, millions of documents
14. Our Original Roadmap
• Don't reinvent the wheel
• Leverage existing standards, work on new ones (e.g. JCR 2, CMIS)
• Build on proven open source libraries (JBoss, Apache, Sun, Eclipse)
• Use a robust software engineering process
• Make it transparent for our community
15. Core ECM
• Document type definition and management
• Storage of the documents and associated metadata
• Document life cycle and versioning
• Access control
• Indexing + query language, which must enable complex queries on both full-text and metadata
16. Higher-Level ECM Services
• Workflow
• Transformation and rendering
• User management
• User interface
• A rich set of HTTP-based APIs exposed to third-party developers and integrators (WS-* and REST)
17. 2.0 and 3.0 (Ongoing)
• Tagging and folksonomies
• Lightweight collaboration (wikis) and publishing (blogs)
• Social networking (“friending” or “following” colleagues or business partners, user timelines)
• Collaborative filtering
• Mobile and disconnected access
• Semantic content categorization and named-entity extraction
19. Nuxeo ECM - Our Approach
(Layer diagram, top to bottom)
• Applications: Correspondence Management, Contracts Management, Invoice Processing, Marketing Asset Management
• Business Solutions: Construction, Media, Government, Life Sciences
• Horizontal Packages: Document Management, Records Management, Digital Asset Management, Content Aggregator, Case Management Framework
• Platform: Nuxeo Enterprise Platform (complete set of components covering all aspects of ECM)
• Content Infrastructure: Nuxeo Core (lightweight, scalable, embeddable content repository)
24. The Strongest Requirement
• Applications (horizontal, vertical or custom) must be buildable just by assembling components (packaged as Java JARs)
• Architecture must allow behavior modification at the repository level (e.g. new document type), at the UI level (e.g. new actions), and at the service level (e.g. adding new services) without recompilation
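One way to picture "adding new services without recompilation" is a runtime that instantiates services by class name from a descriptor. The sketch below is only an illustration under stated assumptions: `ServiceRuntimeSketch`, `Renderer` and `UpperCaseRenderer` are hypothetical names, and the descriptor format (a `Properties` map from interface name to implementation name) is invented here, not the platform's actual mechanism.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

// Sketch of a runtime that instantiates services named in a descriptor.
class ServiceRuntimeSketch {

    // Hypothetical service API.
    interface Renderer {
        String render(String text);
    }

    // Hypothetical implementation; in practice it would ship in a contributed JAR.
    static class UpperCaseRenderer implements Renderer {
        @Override
        public String render(String text) {
            return text.toUpperCase(Locale.ROOT);
        }
    }

    private final Map<Class<?>, Object> services = new HashMap<>();

    // Each descriptor entry maps a service interface name to an implementation class name.
    void load(Properties descriptor) {
        for (String apiName : descriptor.stringPropertyNames()) {
            try {
                Class<?> api = Class.forName(apiName);
                Object impl = Class.forName(descriptor.getProperty(apiName))
                        .getDeclaredConstructor()
                        .newInstance();
                // cast() fails fast if the implementation does not implement the API.
                services.put(api, api.cast(impl));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot register service " + apiName, e);
            }
        }
    }

    <T> T getService(Class<T> api) {
        return api.cast(services.get(api));
    }

    static String demo() {
        Properties descriptor = new Properties();
        // Names are read from the classes only to keep the sketch self-contained;
        // a real descriptor would be a text file packaged with the component.
        descriptor.setProperty(Renderer.class.getName(), UpperCaseRenderer.class.getName());
        ServiceRuntimeSketch runtime = new ServiceRuntimeSketch();
        runtime.load(descriptor);
        return runtime.getService(Renderer.class).render("invoice");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the design is that the runtime never references an implementation class at compile time, so swapping or adding a service only means shipping another JAR and descriptor.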
26. Standards Choice
• Switch to Java was motivated by the desire to be more “standards-compliant”
• But the problem with standards is that there are too many to choose from!
  • Old vs. new or emerging
  • Open standards vs. de facto standards
  • Overlapping standards (hardest issue!)
27. Initial Standards
• Java EE 5, as the structuring general framework for the server-based application (but not for the core services)
• OSGi, as a packaging model for components
• The JCR (Java Content Repository), as the model API to manage content and metadata at the most basic level
28. Initial Standards
• JSF as the presentation layer (part of Java EE 5)
• JBoss Seam, a web presentation framework that extends JSF, because we felt it would provide a much improved developer experience over the “pure Java EE 5” model
29. Notes
• Java EE 5 was really new and still “wet” at the time
• Seam was not a standard, but its concepts eventually merged into one (JCDI, now known as CDI, JSR 299)
• In 2006 OSGi had credibility in the embedded and rich client spaces, not yet on the server
• We dropped JCR support in 2010
30. Open Source Libraries
• The Open Source Java ecosystem started to grow in the late 1990s (Apache) and had a huge boost in the early 2000s (Eclipse, JBoss, OW2, etc.)
• As with standards, there are usually many OSS implementations to choose from
• FYI: Nuxeo EP now embeds more than 200 external open source libraries!
31. Choosing an OSS Library
• License compatibility with the LGPL (this excludes proprietary and GPL licenses)
• Compliance with a chosen standard
• Quality, as witnessed by visual inspection of the source code
• Confidence in the development process (e.g. are there unit tests?) and the community behind the project
32. Benefits and Challenges of Using OSS Libraries
• With OSS, it’s easier to evaluate options
• Forking a library is sometimes the only way to fix a bug or add missing functionality
• But it comes at a tremendous price, because now you have to maintain your own branch
• Becoming a contributor is also sometimes needed, but comes at a price too
• Risk of “JAR hell” (conflicting library requirements)
33. Lessons Learned
• Allow users of our platform to extend it without touching its source code
• Or, even better, without writing code at all!
• Keep your options open, but don’t over-engineer flexibility
36. Layer Cake
Nuxeo EP Architecture (top to bottom)
• Nuxeo UI Frameworks: flexible choice of interfaces
• Nuxeo ECM Services: modular set of content services
• Nuxeo Core: advanced content repository
• Nuxeo Runtime: component and service model
39. OSGi in Theory
• OSGi is a component system developed initially for the embedded systems industry
• Adopted by Eclipse for Eclipse 3.0 (2004)
• Both a module system and service platform (but we’re currently only using the former)
• Modules, or “bundles”, are just JARs with a special manifest (META-INF/MANIFEST.MF)
40. OSGi in Theory
• An OSGi “container” takes care of component activation
• Bundles describe their own imports (dependencies) and exports (exposed API)
• Container can also take care of provisioning
• Class loader isolation can take care of “JAR Hell”
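To make the "JAR with a special manifest" idea concrete, here is a minimal sketch that reads the imports and exports of a bundle using the standard `java.util.jar.Manifest` class. The header names (`Bundle-SymbolicName`, `Import-Package`, `Export-Package`) are standard OSGi headers; the bundle and package names in the sample are invented for illustration, and the comma splitting is a simplification that does not handle version ranges or directives.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Reads OSGi headers from a bundle manifest with the standard JDK API.
class BundleManifestSketch {

    // Hypothetical manifest; the bundle and package names are made up.
    static final String SAMPLE =
        "Manifest-Version: 1.0\n" +
        "Bundle-ManifestVersion: 2\n" +
        "Bundle-SymbolicName: org.example.ecm.workflow\n" +
        "Bundle-Version: 1.0.0\n" +
        "Import-Package: org.example.ecm.core,org.example.ecm.events\n" +
        "Export-Package: org.example.ecm.workflow.api\n";

    static Attributes parse(String text) {
        try {
            Manifest manifest = new Manifest(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Splits a plain comma-separated header into package names.
    // Simplification: attributes such as ";version=..." are not handled.
    static List<String> packages(Attributes attrs, String header) {
        List<String> result = new ArrayList<>();
        String value = attrs.getValue(header);
        if (value != null) {
            for (String name : value.split(",")) {
                result.add(name.trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Attributes attrs = parse(SAMPLE);
        System.out.println("Bundle:  " + attrs.getValue("Bundle-SymbolicName"));
        System.out.println("Imports: " + packages(attrs, "Import-Package"));
        System.out.println("Exports: " + packages(attrs, "Export-Package"));
    }
}
```

A container uses exactly this kind of declared metadata to wire bundles together: a package is visible to another bundle only if one side exports it and the other imports it.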
41. OSGi at Nuxeo
• We package our components as OSGi bundles
• We have our own “OSGi-like” adapter for app servers (JBoss, Jetty, Tomcat, GlassFish)
• Most of our components can also run on Eclipse Equinox (for RCP apps)
• We have our own service registry, but it’s currently not based on OSGi
• We don’t provide class loader isolation
42. OSGi at Nuxeo
• Goal is to be able to run everything on a “real” OSGi container in 2011
• ... and to fully leverage the OSGi service stack at the same time
• Including service registry, hot-reload, class isolation, etc.
• Biggest conceptual issue: overlap with Java EE
43. Plugins and Extension Points
• Inspired by the Eclipse architecture
  • Eclipse = a core runtime engine + a set of plugins
  • Plugin: the smallest extensible unit to contribute additional functions to the system
  • Extension points: boundaries between plug-ins
• A plugin (bundle) can contribute either configuration (pure XML contribution) or code (XML + Java)
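The "pure XML contribution" case can be sketched as follows: a component declares a named extension point, and a descriptor shipped by another plugin contributes to it. This is a simplified illustration, not the platform's actual API; the class name `ExtensionRegistrySketch`, the `doctypes` extension point, and the XML element and attribute names are all assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Minimal registry dispatching XML contributions to declared extension points.
class ExtensionRegistrySketch {

    // The owner of an extension point receives each contribution made to it.
    interface ExtensionPoint {
        void contribute(Element contribution);
    }

    private final Map<String, ExtensionPoint> points = new HashMap<>();

    void declare(String name, ExtensionPoint point) {
        points.put(name, point);
    }

    // Parses a (hypothetical) descriptor and hands every child element of each
    // <extension> to the extension point named by its "point" attribute.
    void load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList extensions = doc.getElementsByTagName("extension");
            for (int i = 0; i < extensions.getLength(); i++) {
                Element extension = (Element) extensions.item(i);
                String target = extension.getAttribute("point");
                ExtensionPoint point = points.get(target);
                if (point == null) {
                    throw new IllegalArgumentException("Unknown extension point: " + target);
                }
                NodeList children = extension.getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    if (children.item(j) instanceof Element) {
                        point.contribute((Element) children.item(j));
                    }
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Invalid descriptor", e);
        }
    }

    // Registers two document types purely through configuration.
    static List<String> demo() {
        List<String> docTypes = new ArrayList<>();
        ExtensionRegistrySketch registry = new ExtensionRegistrySketch();
        registry.declare("doctypes", e -> docTypes.add(e.getAttribute("name")));
        registry.load(
            "<component name='org.example.invoice'>" +
            "  <extension point='doctypes'>" +
            "    <doctype name='Invoice'/>" +
            "    <doctype name='CreditNote'/>" +
            "  </extension>" +
            "</component>");
        return docTypes;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The contributing plugin needs no compile-time knowledge of the registry: the extension point name is the only contract between the two sides.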
45. Note
• This “core + extensions” pattern is very common in successful open source projects
  • Linux kernel + drivers (modules)
  • Firefox + plugins
  • Emacs + Emacs LISP macros
• It’s a key to enabling an architecture of participation
46. Event Bus
• EventHandlers, aka listeners
• Synchronous / PostCommit / Asynchronous
• Easily contributed (Java / script / MDB)
• Great solution for:
  • Gluing together independent components
  • Enforcing business rules (synchronous inline)
  • Pushing/getting data to/from external systems
49. Goals
• Must enable the participation of third party contributors (partners, community)
• Must improve synchronization between custom developments and OSS projects
• Agile development practices (XP, TDD) already used at Nuxeo since 2001 or so
• Must complement them with simple, efficient and scalable project management practices
56. “Every successful open source project I know uses PRIM. Every closed source project I know, doesn't. People wonder how open source projects manage to create high-quality products without managers or accountability. The answer: we're accountable to our infrastructure. PRIM is the open source secret sauce.”
Ted Husted http://jroller.com/TedHusted/entry/prim
60. More Tools
• IDEs (Eclipse mostly)
• Testing (JUnit, Selenium, WebDriver)
• Static code analysis (FindBugs, IDEA inspections, Checkstyle, Enerjy)
• Various profilers and debuggers
61. Outstanding Issues
• CI at our level is very resource-intensive (10-server farm)
• It’s hard (read: impossible) to test UI without a browser
• OTOH, “plain” Selenium tests are hard to maintain
• Some pieces (Maven repository, Selenium testing) are fragile, and introduce heisenbugs
63. Open Source, Java-Based
• Some Java-based open source WCM or E2.0 platforms (XWiki, Jahia...) have developed ad hoc component systems similar to ours
• Alfresco is an ECM solution with a static architecture based on Spring, which makes it harder to customize and extend
• Apache Sling is a framework based on JCR and OSGi, but doesn’t come with complete solutions and seems more focused on WCM
64. Open Source, Scripting Languages Based
• Drupal, Joomla, WordPress (PHP) and Plone (Python) are examples of extensible content management platforms based on scripting languages with large ecosystems
• Plugins usually rely on callback functions / methods instead of extension points and events
• Plugins can break due to API changes and the lack of static verification of compatibility
65. Proprietary
• Proprietary software is, by definition, harder to study and tear apart
• But the general view is that big-name vendors' products (Documentum, Open Text, FileNet) are based on mixes of old technologies patched together after various acquisitions, and are harder to evolve and to program against for modern developers
67. Ongoing Work
• Simplifying the developer experience: faster code/test turnaround, simpler web front end development using the more recent JAX-RS standard or Google Web Toolkit (GWT).
• Development of business-oriented RESTful APIs to allow high-level interaction with the content, and eventually business application development by non-technical users (cf. Nuxeo Studio).
69. Ongoing Work
• Replication, in both LAN (for scalability and fault tolerance) and WAN (for replication between remote data centers, or between a server and a desktop or mobile client) contexts
• Social integration (using OpenSocial)
• Further work on semantic technologies
70. Ongoing Work
• Porting the platform to the Cloud, and exposing it as a PaaS service
• Work on mobile ECM, with clients for platforms such as the iPhone/iPad, Android and BlackBerry operating systems
• Bringing Nuxeo EP to Java EE 6 and full OSGi compliance
72. Key Findings
• The Nuxeo EP architecture fits both the OSS “architecture of participation” vision and our business model and goals
• Main effort has now moved from the platform to its periphery (extensions, applications, development and operation tools), as enabled by the architecture
• Still work to do on some key standards compliance aspects (OSGi, Java EE 6, CMIS...)